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Abstract 

The science of complex networks is a new interdisciplinary branch of science which has arisen recently 
on the interface of physics, biology, social and computer sciences, and others. Its main goal is to discover 
general laws governing the creation and growth as well as processes taking place on networks, like e.g. the 
Internet, transportation or neural networks. It turned out that most real-world networks cannot be simply 
reduced to a compound of some individual components. Fortunately, the statistical mechanics, being one 
of pillars of modern physics, provides us with a very powerful set of tools and methods for describing and 
understanding these systems. In this thesis, we would like to present a consistent approach to complex 
networks based on statistical mechanics, with the central role played by the concept of statistical ensemble 
of networks. We show how to construct such a theory and present some practical problems where it can 
be applied. Among them, we pay attention to the problem of finite-size corrections and the dynamics of 
a simple model of mass transport on networks. In particular, we calculate the cutoff function for finite 
growing networks in the generalized Barabasi- Albert model and show how the maximal degree observed 
in such a network depends on its size and on the exponent 7 in the power-law degree distribution. We 
show that this structural cutoff is gaussian only for 7 = 3, and is never exponential for 2 < 7 < 4. In 
parallel, we present numerical results for equilibrated networks, that is networks obtained in a sort of 
"thermalization" (randomization) process. We discuss also similarities and differences between growing 
and equilibrated networks. Concerning dynamics on networks, we study so called zero-range process 
being a system of particles hopping between sites of a network. We discuss known results for its static 
and dynamical properties on homogeneous networks, where all nodes have the same degrees, and derive 
new predictions for inhomogeneous graphs. We show that when the density of particles passes a certain 
threshold, a condensate emerges at the most inhomogeneous node. Its life-time grows exponentially with 
the size of the system, contrary to homogeneous graphs where it grows only like a power law. We find 
also a special type of an inhomogeneous network, for which the average distribution of balls is scale-free 
at the critical point. 



Preface 



This is a slightly modified text of the PhD thesis written as a part of the author's PhD studies in theoretical 
physics under the supervision of Prof. Z. Burda, and defended on April 5th, 2007, at the Faculty 
of Physics, Astronomy and Applied Computer Science, Jagellonian University in Cracow, Poland. In 
comparison to the officially accepted doctoral dissertation, available from Jagellonian University Library, 
this version has been changed according to some critical remarks of referees and other people who read it 
before and after the defense. In particular, some typos and errors in formulas have been corrected, and 
some references added or updated. There are also some minor changes. For instance, in the original text 
we used the word "homogeneous" to refer to a certain type of networks. We replaced it here by the word 
"equilibrated" which, as we realized, better relfects the structure of these networks and does not lead to 
a confusion with another, commonly accepted meaning. The contents is, however, almost unchanged, so 
is the order of all chapters, sections etc. 
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Chapter 1 

Introduction 



1.1 Science of complex networks 

We live in the world dominated by networks, either in technological or social sense. Who can now 
imagine our existence without electric power transmission lines, organized in a kind of network with 
nodes being power plants or transformer substations, or without the Internet, the most powerful medium 
of the 21th century? In fact, networks surround us. We ourselves are also a part of a huge network 
of interpersonal contacts, where ideas or diseases can spread. Highways, subways, air traffic as well as 
scientific collaborations or sexual contacts' networks are just a few further examples. Some of them are real 
physical networks (the Internet), some of them describe non-physical relations between objects (the World 
Wide Web), being defined in some abstract space. During the last decade, networks became a subject of 
interest of scientists who want to discover general laws governing their formation and growth. It is a great 
success that despite an enormous variety of networks and essential differences in their physical structure, 
it is possible to find such laws, applying to the majority of real-world networks. The most important 
observation is that these networks are complex, what means that their properties cannot be simply 
reduced to a compound of individual components. Instead, a new quality emerges when many objects 
are linked together forming a network. Therefore, the reductionism - a powerful tool of physics - fails 
when one tries to examine complex networks. Fortunately, one branch of physics, namely the statistical 
mechanics, provides us with an ideal set of tools, methods and ideas for describing and understanding 
these sophisticated systems. The application of these ideas to complex networks uncovers unexpected 
connections to other areas of physics, as for instance to percolation or Bose-Einstein condensation. 

In recent years, many properties of real-world networks have been described. Many models have been 
proposed. As a result, a new inter-disciplinary science, the science of complex networks, has emerged on 
the interface of physics, chemistry, biology, computer science and other disciplines. It is not the intention 
of author to review all important results of the science of complex network in this short introductory 
chapter. For a review, we refer the reader to excellent papers [TJ O [3], or to a newer one [4] presenting 
also some recent developments in the field. However, to give a better comprehension of results presented 
in this thesis and to make it self-contained we shall describe some ideas which are especially important for 
our purposes. So in the next section of this chapter we shall discuss some basic concepts of graph theory, 
which provides a natural framework for description of networks. Then in the subsequent section we shall 
recall the empirical findings on real- world networks which have motivated the outbreak of interest in the 
field and then the rapid development of the science of complex networks in recent years. The explanation 
of the observed real-world properties is still the main objective of many scientific publications. In the 
last section of this chapter we shall briefly discuss the aim and the scope of the thesis. 

1.2 Graphs as models of networks 

The material presented in this section is intended to give the reader a brief introduction to the notation 
and some basic concepts developed by mathematicians in graph theory and then widely accepted by the 
community of complex networks. All the definitions given here and also many others can be found for 
example in the book [5]. The reader familiar with graph theory may skip this section. 

It is probably a trivial statement that a network can be represented as a graph, a mathematical object 
consisting of a set of nodes (called also vertices or sites) and a set of edges (links) , which are related by 
incidence relations. The nodes are joined by edges and the whole object is usually represented graphically 



1 



Figure 1.1: Left: an example of an undirected pseudograph with seven nodes and eight edges. The nodes 
are labeled for convenience. The node 7 is disconnected from the main body. The node 3 has a loop that is 
a self-connecting link. There is a double-link between the nodes 4 and 5. If the graph were simple, it would 
have neither multiple- nor self-connections. The degrees of all nodes are 1, 3, 5, 4, 2, 1, 0, respectively. In 
addition, there is a triangle on the nodes 2,3,4. Right: an example of directed pseudograph with 
AT = 6, £ = 10. 

as in Fig. 11.11 For each edge the incidence relation says which nodes are its endpoints. In the thesis we 
shall denote the total number of nodes and links in the graph by N and L, respectively. While referring 
to graph's size we shall usually mean the number of nodes. Nodes shall be denoted by small Latin letters 
i,j,.... For simple graphs (see below), each link is uniquely determined by a pair of nodes being 
its endpoints. 

For many purposes it is convenient to differentiate between a directed graph, where every link i — ► j 
points only in one direction, and an undirected graph for which links do not have orientation. In Fig. 11.11 
we show examples of an undirected and a directed graph. Not every edge must connect distinct vertices. 
An edge which has two identical end-points is called a loop or a self-connection. If two nodes are 
connected by more than one link, the corresponding links are called a multiple-connection or multiple- 
links. One is often interested in graphs without self- and multiple-connections, which are called simple 
graphs or sometimes Mayer graphs, in contrast to graphs with self- or multiple-connections which are 
called pseudographs or degenerate graphs. In the course of this work we will see however that in some 
respects pseudographs are more convenient for analytical treatment. A graph is fully described by its 
adjacency matrix A, whose entries Aij give the number of edges between nodes In this thesis we shall 
mainly consider undirected graphs, for which A is symmetric: Aij — Aji. Because each self-connection 
can be viewed as two links: one going out and one going in, it is convenient to define diagonal elements 
of A to be equal to twice the number of loops incident with the node: An = 0, 2, 4, . . . . Alternatively 
the factor of two for the diagonal elements can be attributed to the fact that each loop is incident with 
the vertex two times. Of course for simple graphs, all diagonal elements vanish: An = and off-diagonal 
A^ are either zero or one. 

The most important local quantity characterizing a graph is node degree. The degree fej of node i is 
just the number of links incident with the node: h — J^j Aij. In case of directed graphs one can define 
the out- and in- degree separately, for outgoing and incoming links. For a simple undirected graph, the 
node degree is equal to the number of nearest neighbors of the given node, that is nodes linked to it by 
an edge. The average degree k of a graph is the average number of links per one node, that is k = 2L/N, 
because each link is counted twice in the sum J^i = We shall use the notation k when N, L are 
fixed, as for instance for the given network, or (k) when N or L may fluctuate, as for instance for networks 
in the given statistical ensemble. 

A graph is said to be dense if the average degree is of order O(N) for N — > oo or to be sparse if k 
approaches a constant in this limit. A special example of a dense graph is a complete graph for which 
every pair of nodes is connected by an edge, and thus L — N(N — l)/2 and k = N — 1, and of a sparse 
graph is an empty graph with L — and k = 0. There are more special graphs having their own names, 
some of which will be mentioned in the next chapters. 

A subgraph is a graph defined on a subset of nodes which are connected by links preserving the 
incidence relation of the whole graph. The simplest subgraphs are a line (a single edge joining two nodes) 
or a triangle: three nodes joined together by three links, see Fig. 11.11 Small subgraphs are called motifs 
in the language of complex networks and will be discussed later. 

A path joining nodes i\ and i n is a set of all nodes it, . . . ,i n , where all intermediate nodes are distinct 
and every pair ik,ik+i is connected by a link. In other words, it is a walk which starts from it, ends 
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Figure 1.2: A tree graph with N = 8 nodes and L — 7 edges. Note that for any tree L = N — 1. 

in i n and goes along links through the network, visiting each node no more than once. The length of a 
path is just its number of links. A shortest path (there may be more than one) between a pair of nodes 
is called geodesic path, and the length of this path is called geodesic distance. The longest geodesic from 
all possible paths is called a diameter of graph. The average geodesic distance is sometimes also called 
diameter, but strictly speaking it is quite a distinct quantity. In this paper we shall however use the 
latter definition since it is often much simpler to calculate, and moreover for graphs representing complex 
networks these two quantities are strongly correlated. A graph is said to be connected if every two nodes 
can be connected by a path. Each subgraph built on all vertices which can be connected by a path is 
called connected component, or just component, of the graph. When the size of a component scales as 
O(N) it is called a giant component. The graph on the left-hand side of Fig. 11.11 has two components: 
one has six nodes and the other only one, namely the node 7. 

A close path is called cycle. The simplest cycle is the triangle graph. A connected graph with no 
cycles is called tree (Fig. I1.2p . Trees play an important role because on the one hand many models of 
complex networks can be exactly solved for trees and on the other hand some important classes of graphs 
locally look like trees. 

1.3 Properties of real- world networks 

All definitions presented in the previous section have been developed by mathematicians long before the 
science of complex networks received its name and became popular between scientists working in different 
disciplines. In this section we shall present some of new ideas which have emerged recently, mostly in the 
last decade, as a result of empirical studies of real-world networks. Some of them have been introduced 
not as well-defined mathematical concepts but rather as "operative" definitions which captured interesting 
properties of investigated networks. 

Since the works of Milgram, Albert, Barabasi, Watts, Strogatz, and many others, three concepts have 
occupied an important place in the science of complex networks. These are power-law (or more generally: 
heavy tailed) degree distributions, the concept of small- world and the clustering. We shall discuss them 
shortly and describe how they apply to some real- world networks. All quantities and definitions shall be 
given for undirected networks if it is not stated otherwise. 

Degree distribution. Like we said, node's degree is the number of links connected to that node. 
Let us define now the probability II(fe) that a randomly chosen node has exactly k links. Tl(k) is called 
the degree distribution and can be obtained for any given network by making a histogram of the degrees 
for all nodes. By definition, the degree distribution is normalized: J2k = 1 and its mean J2k kH(k) 
equals to the average degree k. Investigations of real networks have led to a surprising result that many 
of them have a power-law tail in the degree distribution: 

n(fc) - A: -7 , (1.1) 

for intermediate values of 1 -C k <C N where N is the number of nodes in the network The value 
of 7 is typically between 2 and 4. This differs crucially from what one can imagine either for purely 
random networks, or for regular grids like square or cubic lattices. In the case of regular lattices, all 
degrees are the same, so II(fc) = S k g, while for random graphs one can argue that since edges are placed 
randomly, the distribution II(fc) should be close to a Poissonian one centered around k. The network 
with a power-law degree distribution is called scale-free network (S-F), to emphasize the fact that there 
is no typical scale in the power-law describing the tail of the node degree distribution. Many models have 
been proposed to explain this feature, some of them will be presented later. 

Another quantity related to degrees is a two-point function e(k, q) giving the probability that a 
randomly chosen edge joins two nodes of degrees k and q. The values e{k,q) form a symmetric matrix: 

1 For k of order N there is always some correction, see the next chapter. 
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^,g)= T W - (1-4) 



Figure 1.3: The assortativity .4 of the network and the average degree of nearest neighbors k nn (k). For 
an uncorrelated network, the assortativity is zero and fc nn (fc) is a horizontal line. For correlated networks, 
two scenarios are possible: either A < if the network is disassortative, or A > if it is assortative. 

e(k,q) = e{q,k). The function has the following properties: 

J2<k,q) = 1, (1.2) 

k<q 

Y / t(k,q)=J2<<l,k) = kU(k)/k. (1.3) 

9 9 

The last equality becomes obvious when one realizes that the sum over q gives the probability that a 
randomly chosen edge incidents on a vertex with degree k. But the fraction of such edges is just fcn(fc) 
and the division over k gives the correct probabilistic interpretation. If there were no correlations, the 
probability e(fc, q) would factorize: 

kU(k) qU(q) 
~P 

but for almost all networks t(k,q) ^ e I (k,q). The two-point function is, however, not convenient for 
examining degree-degree correlations, therefore another alternative quantities based on e(k,q) have been 
introduced, as for instance an average degree fc nn (fc) of nearest neighbors of a node with degree k. It can 
be expressed through the two-point correlations as follows: 

^ n(fc) = fcr|fc)£ 9e(M) - (L5) 

In Fig. 11.31 we sketch three possible behaviors of the correlations in the network, studied by means of 
k nn (k). When this quantity grows with k, it means that the higher is degree of a node, the higher is 
average degree of its neighbors. In order to describe this behavior one uses the term "assortativity" which 
is borrowed from social sciences. If k nn (k) decreases with k, the network is said to be disassortative. 
One can easily show that in case of uncorrelated degrees {Q} , the average degree of nearest neighbors is 
constant (horizontal line in Fig. 11.30 . One can go further and reduce assortativity to a single coefficient 

This quantity is equal 1 for a totally assortative network: e(fc, k) > 0, e(k, q) = for k ^ q, and is negative 
for disassortative networks. In the paper [7] a slightly different quantity, namely the Pearson correlation 
coefficient, was measured for real networks. It was found that artificial networks like the WWW or 
the Internet are mostly disassortative, while the citation network or other networks describing relations 
between human beings are rather assortative. 

Small-worlds. The most popular manifestation of the small-world effect is the "Six degrees of 
separation" being also the title of S. Milgram's book. He found that a typical distance in the network of 
acquaintance among people in the USA is about six. In the language of graph theory, this means that the 
average path length is six. If relationships between people formed a regular, two-dimensional grid, then 
for N — 3 x 10 8 people the average distance (I) would be of order 10 4 . The experiment made by Milgram 
showed that (I) grows rather as <~ In AT. More generally, one speaks about a small- world network when 
the typical distance or the diameter grows like logarithm of the system size. It is different from the case 
of a regular lattice in d dimensions where 

(l)~N 1/d , (1.7) 
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but it agrees well with simple models of random graphs. Indeed, one can estimate the number of nodes 
of a random graph at distance I to some particular node as k . This has to be equal to N for I being 
the diameter and hence (I) ~ In AT. If one defines a fractal dimension of the network as d from Eq. (|l,7p . 
one gets d = oo for a small-world. We will see later that random graphs of a special type, namely 
homogeneous random trees (Section 2.3) do not need to be small-worlds, thus randomness per se is not 
a sufficient condition to trigger the effect. 

Clustering. This is a common property of many social networks which describes the tendency to 
form cliques of acquaintances. It is a rule that friends of our friends are often also our friends. In the 
language of graphs this means that there are many triangles in the network. Two measures of clustering 
are most popular. The first one is a global measure or a clustering coefficient C: 

q 3 x number of triangles 
number of connected triples ' 

where a connected triple is a subgraph consisting of three nodes with at least two links between them. 
For a complete graph, all nodes are connected and thus C = 1 which agrees with the intuition that 
the complete graph forms a big interconnected clique. For trees, C — because of the absence of any 
cycles (and hence also triangles) which is also intuitively comprehensible. For any other networks, C lies 
somewhere between and 1. Another definition of the clustering coefficient is based on local properties 
of nodes. Let i be a node with degree hi and Ci be the number of edges existing between the neighbors 
of i, or in other words, the number of triangles having one vertex at i. Then we define a local clustering 
coefficient: 

which is one if all neighbors of the node i are connected. The clustering coefficient for the whole network 
is the average of all CVs. Both definitions of C are qualitatively consistent and give roughly the same 
values for real networks, being rather high (typically C > 0.1) in comparison to random graphs of the 
same size where C ~ 1/N (see Section 2.1.1). 

After this introduction to the most interested observables on complex networks, let us shortly discuss 
some examples of real- world networks. Let us begin with the World Wide Web (WWW), which represents 
the largest network for which information about topology is currently available. The nodes are web 
pages and the edges are hyperlinks pointing from one page to another. This kind of structure can be 
represented by a directed graph. It is, however, possible to consider undirected networks where nodes are 
self-contained collections of web pages and the undirected link is formed if there is any hyperlink between 
pages belonging to different sites. The Web was probably the first network, for which the power-law 
degree distribution H(k) was discovered by Albert and Barabasi, and Kumar et. al. [HIE]- The total 
number of nodes in the WWW is of order several billion^!, but until now it was impossible to search the 
whole network. For a subset of about A" = 300, 000 nodes the exponents ji n and j ou t in power laws for in- 
and out-degree distribution were estimated to be 2.1 and 2.45, respectively. Later these estimations have 
been corrected to 2.1 and 2.72, where both distributions were collected on a network with 200 millions 
of documents [10]. The power law is observed in a range of k covering five orders of magnitude. The 
network considered in [10] had the average degree k — 7.5, and the average path length around 16, which 
agrees with the conception of small worlds since In A" ss 19. The clustering coefficient has been found for 
another subset of the WWW [IT], with 1-degree nodes excluded and the size A" w 150, 000, to be about 
0.11, being much larger than for randomized graph of the same size. 

In contrast to the WWW, the Internet is a physical network of computers (nodes) and wire- or 
wireless connections (edges). For the Internet treated as undirected network, the power law in the degree 
distribution has been found to hold over three orders of magnitude with 7 being in the range 2.1 — 2.5 
[j"2l 113). Also the high clustering and small- world behavior have been confirmed, indicating a similarity 
of the Internet to the WWW. 

There is also a large class of so called social networks. These are networks describing relationships be- 
tween humans, either based on physical interpersonal contacts (movie or science collaboration networks, 
human sexual contacts) or non-physical like the citation network being in fact the graph of citation pat- 
terns of scientific publications (for a review see e.g. [US]). These networks share also some common 
features. All of them are scale-free with the exponent 7 varying between 2.1 for some scientific collab- 
orations to 3.5 for the web of sexual contacts. Also the clustering is much larger than for an analogous 
random graph. The very interesting case happens for the citation network of papers published in Phys. 



2 In 2004 Google stated that they indexed over 4,000,000,000 web pages. It is, however, difficult to define, what is a single 
web page, therefore their number varies by two order of magnitude from definition to definition. Moreover, the WWW 
grows very quickly, making all precise estimations meaningless. 
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Rev. D [14], where 7 seems to be very close to three, indicating that it can evolve due to the preferential 
attachment [15] described below in Section 2.2.1. 

The last class of networks we want to mention are various biological networks. For instance, one can 
study the metabolism of a living cell and construct a graph with nodes being chemical reagents and links 
denoting possible chemical reactions. Again, for a network like that [16] the usual behavior has been 
found: the power-law degree distribution, high clustering and small diameter. Another type of networks, 
showing possible bindings between proteins [17], exhibits the power-law behavior with 7 = 2.4 but with 
an exponential cutoff above k w 20. 

There are many other examples of networks: ecological, neural or linguistic ones or a network of call 
phones etc., which have not been cited here. We decided to skip them because the examples presented 
above are already a good sample of what one can find in real networks. The inquiring reader is referred 
to the review articles given in the first section of this chapter. 

1.4 The aims and the scope of the thesis 

Although many models have been proposed to capture various properties of complex networks, there 
is a relatively small number of papers which aim on formulating a sort of general theory of complex 
networks from physicist's point of view. In such a theory one is interested in having a general framework 
for modeling, calculating, computing or estimating quantities of interest and explaining the existing 
facts rather than in formulating general theorems or finding formal proofs of statements with idealized 
assumptions. Of course these two directions should be developed in parallel, since they are complementary. 
Here we shall concentrate on the former one, that is on practical aspects and on a physical theory of 
complex networks. The latter direction is covered by the mathematical literature on random graphs. 

Therefore, the aim of this thesis is to present a theory of complex networks based on statistical 
mechanics, where the central role is played by the concept of statistical ensemble of graphs. This approach 
to complex networks has been continuously developed over the past ten years by many people, including 
D. ben-Avraham, M. Bauer, J. Berg, D. Bernard, P. Bialas, G. Bianconi, P. Blanchard, M. Boguna, Z. 
Burda, G. Caldarelli, J. D. Correia, I. Derenyi, S. N. Dorogovtsev, I. Farkas, A. Fraczak, K.-I. Goh, A. V. 
Goltsev, S. Havlin, J. A. Holyst, B. Kahng, D. Kim, P. Krapivsky, T. Kriiger, A. Krzywicki, M. Lassig, 
D.-S. Lee, J. F. F. Mendes, M. E. J. Newman, G. Palla, J. Park, R. Pastor-Satorras, A. M. Povolotsky, 
S. Redner, A. N. Samukhin, T. Vicsek, and many others. In the thesis we discuss ideas and methods 
developed in this approach as well as a variety of results obtained within this framework. In this general 
context we present the original contribution of the author. It is partially based on yet unpublished work. 
The remaining part of the thesis is divided into three chapters, subdivided into sections. Each chapter 
and each section begin with a short introduction and a summary of most important results derived there. 

Chapter 2 is devoted to a general presentation of statistical mechanics of networks. First, the most 
important models are described and then they are formulated in terms of statistical physics. The approach 
via statistical ensemble of random graphs is developed and it is shown how to design a very general Monte 
Carlo algorithm, suitable for generating various networks on a computer. Also the rate equation approach 
is presented since it is well suited for growing networks, being the vast part of proposed models. The 
chapter is ended with a section on the comparison between growing networks and networks obtained by 
a process of "thermalization" (or "homogenization") by rewiring links. 

Chapter 3 deals with applications of these ideas to complex networks. First we discuss finite-size 
corrections to power-law degree distributions. Such corrections are always present for finite networks and 
may significantly affect actual properties of the network. We develop an analytical method to evaluate 
the corrections and present results of this evaluation for some S-F networks. We compare the analytic 
results with numerical simulations. In the subsequent section of this chapter we discuss dynamics taking 
place on networks. We consider a special model called zero-range process. The application of the zero- 
range process to the description of many important phenomena like mass transport or condensation in 
homogeneous systems has been widely discussed in the literature. We concentrate here on the behavior 
of the zero-range process on complex networks, where inhomogeneity in nodes degrees plays an important 
role. After a preliminary discussion of the model we present our findings concerning static and dynamical 
properties of the process on inhomogeneous networks. 

Chapter 4 contains conclusions and outlook. 
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Chapter 2 

Statistical mechanics of networks 



For a long time networks were studied by mathematicians as a part of graph theory. In recent years it 
has been discovered that many concepts and methods of statistical physics can be successfully applied to 
description of complex networks, and many papers have been dedicated to the problem of formulating 
principles of statistical mechanics of networks (see e.g. p] U8J 032 EQl HI])- In this chapter we shall 
present some of these ideas. Although all of them originate from the same statistical physics, the nature 
of complex networks leads to distinguishing two classes of methods: those for networks being in a sort of 
equilibrium to which the concept of statistical ensembles naturally applies, and those for non-equilibrium 
networks which are most naturally formulated within the rate-equation approach. We shall refer to the 
two classes of networks as to equilibrated and growing (causal) networks, respectively. While the 
second term is commonly accepted in this context, networks belonging to the first class are sometimes 
called homogeneous networks [22] or maximal entropy random networks [231. Here we shall use the term 
"equilibrated" since it resembles the way how these networks are generated_|. 

We have split this chapter into three sections. In the first section we discuss equilibrated networks. 
We first present the most famous examples of networks of that type. Then we show how to formulate 
a consistent theory of statistical ensembles of these networks starting from the simplest construction of 
Erdos-Renyi random graphs. We show that ascribing non-trivial statistical weights to graphs from this 
set we can produce networks with any desired features, as for instance networks having the power-law 
degree distribution, high clustering, degree-degree correlations etc. We present also a dynamical Monte 
Carlo algorithm, based on a construction of Markov chains, which allows one to generate equilibrated 
graphs. 

At the beginning of the second section we show some famous examples of growing networks. These 
networks are generated by a growth process in which new nodes are attached to the existing network, 
so by the construction nodes are causally ordered in time. Therefore the words "growing" and "causal" 
are used to name these networks. Their growing character explains why the rate equation approach is 
so successful in this field. We will see, however, that causal networks can also be described within the 
formulation via statistical ensembles which in some cases is even more convenient. 

In the third section we present differences and similarities between equilibrated and growing complex 
networks. The causality manifests itself as a very strong constraint that selects a subset of networks from 
the corresponding set of equilibrated graphs. In effect, "typical" networks in this subset usually have quite 
different properties than "typical" networks in the whole ensemble, even if networks in the two ensembles 
have identical statistical weights. 

The ideas presented in this chapter have been introduced earlier by many people. They are scattered 
in many papers and used in different contexts. Here we want to collect them and comment on their 
applicability to some problems in the theory of complex networks. This shall form a basis for the 
considerations presented in the next chapter, where some applications will be discussed. 

2.1 Equilibrated networks 

As mentioned, we shall use the term "equilibrated networks" to refer to networks which are closely 
related to maximally random graphs. Although they can be constructed by many different methods, 
their common feature is that nodes, even if labeled, cannot be distinguished by any other attribute. For 

1 In our earlier papers we often called these networks "homogeneous". In this thesis we shall reserve this word for networks 
with all nodes having the same degree, like fc-regular graphs, with all degrees being equal to k. This is also the most popular 
meaning in the literature. 
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example, they may not be causally ordered. This means that if one generates a labeled network and 
repeats the process of generation many times, each node will have statistically the same properties as 
every other node, e.g. it will have on average the same number of neighbors, the same local clustering 
etc. We stress here the meaning of the phrase "statistically the same", which means that it does not 
make sense to speak about a single network, but rather about a set of networks, similarly as one speaks 
about the set of states in classical or quantum physics. In this way a statistical ensemble of graphs 
naturally arises as a tool for studying "typical" properties of networks. As we shall see later, equilibrated 
networks belonging to the given ensemble are in a sort of thermodynamical equilibrium, however it is not 
an equilibrium in the sense of classical thermodynamics, where the statistical weight of a state is given by 
the Gibbs measure: ~ exp(—@E), with E being the energy of the state. In the case of complex networks 
it is convenient to abandon the concept of energy and Gibbs measure and consider a more general form of 
statistical weights. Therefore such a concept like temperature is often meaningless, although there were 
some attempts to define this quantity for networks |i24j . 

Before we define a statistical ensemble of equilibrated networks, we shall present some examples of 
networks belonging to this class. They were introduced over the past 50 years. One of them, known as 
Erdos-Renyi model (ER model), is a pure mathematical construction. Much is known and can be proved 
rigorously for that model. In this respect, the ER model is exceptional since other models invented to 
mimic some features of real networks have not been studied so thoroughly and many results are not so 
rigorous. After a short presentation of the ER model we shall show how to change statistical properties 
of typical graphs by introducing an additional weight to every graph in the ER ensemble. The resulting 
ensemble of equilibrated graphs can be flexibly modeled by choosing appropriate weights. For instance, 
we shall see how to obtain a power-law degree distribution, or how to introduce degree-degree correlations. 
Towards the end of the section we shall present a quite general Monte Carlo algorithm for generating 
such equilibrated weighted graphs. 

2.1.1 Examples of equilibrated networks 

As a first example of a network model belonging to the class of equilibrated networks we shall describe the 
Erdos-Renyi model. In their classical papers [25] in 1950s Erdos and Renyi proposed to study a graph 
obtained from linking N nodes by L edges, chosen uniformly from all ( 2 ) = N(N — l)/2 possibilities. 
In this thesis we shall often refer to it as a maximally random graph since it is totally random, that is 
edges are dropped on pairs of nodes regardless of how many links the nodes have already got. The only 
constraint is that one cannot connect any pair of nodes by more than one link, so the ER graph is a 
simple graph. The graph can be constructed in an alternative way by random rewirings. This will be 
discussed later in section 2.1.5 which is devoted to computer simulations. Beside the ER model there is 
also a very similar construction called the binomial model. Here one starts with N empty nodes and 
joins every pair of nodes with probability p. The name of the model becomes obvious when one realizes 
that the distribution P{L) of the number of links L is given by the formula: 



In this model, also introduced by Erdos and Renyi, the number of nodes is not fixed, but fluctuates 
around (L) = pN(N — l)/2. This means that also the average degree k is a random variable with the 
mean (k) = pN. However, because real-world networks have fixed average degree while their size can be 
very large, in order to compare binomial-graphs to real-world networks one usually scales p oc 1 /N, Under 
this scaling the corresponding graphs have fixed k and thus are sparse. If one calculates the variance of 
P(L) keeping the average degree constant in the limit of N — > oo, one finds that the variance grows only 
as ~ N and that the distribution P(L) becomes Gaussian with the relative width ~ 1/yN falling to zero. 
Thus almost all binomial graphs have the same number of links (L) in the thermodynamical limit and 
therefore the ER and binomial graphs become equivalent to each other for large N. Later on we shall see 
that the ER model defines a canonical ensemble of graphs while the binomial model - a grand-canonical 
ensemble, with respect to the number of edges. In Fig. 12.11 we show some examples of binomial graphs 
for different p. 

Like we have already mentioned, we are interested in properties of the model in the thermodynamical 
limit, that is for very large graphs. The great discovery of Erdos and Renyi was that many motifs, like 
trees of a given size, cycles or the giant component, appear for typical graphs suddenly when p crosses 
a certain threshold value p c . The thresholds are different for different motifs. For p just below p c there 
are almost no motifs of a given type, while for p just above p c the motifs can be found with probability 
one. This is similar to the percolation transition on a lattice. For random graphs, however, p c depends 




(2.1) 
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usually on the system size, N, and must be properly scaled to get fixed values of critical parameters in 
the thermodynamical limit. For example, if one scales p as p = k/N, the desired average degree k plays 
the role of a control parameter. In the limit N — ► oo, 

(k) = k, (2.2) 

and the graph is sparse making it comparable to some real networks. One can ask what is the critical 
value of the control parameter k, for some motifs to appear on the network. Erdos, Renyi and their 
followers were interested in a more general problem. If one assumes that p scales as p ~ N~ z for large N 
with z being an arbitrary real number, what are critical values of z at which some properties appear in 
the thermodynamic limit? Below we present some important findings. 

1) Subgraphs: for binomial graphs one can determine the threshold values of the exponent z when 
subgraphs of a given type appear. One can argue [26] that the average number of subgraphs having n 
nodes and I edges is equal to 

N\ r? 1 . N n n l 

jy )!L^ Ki £UL^N^-" 1 (2.3) 
n J ni ni 

because n nodes can be chosen out of N in ( ) possible ways and they can be connected by I edges with 
probability p . In addition one has to take into account that if one permutes n nodes' labels one obtains 
nl/ni different graphs, where nj is the number of isomorphic graphs. From the formula H2.3J1 one can 
infer the critical value of the exponent z for having at least 0(1) subgraphs of the given type in the limit 
N — > oo. For instance, the critical value of z for a tree of size n is z c = n/(n — 1), since for trees I = n — 1. 
This means that for z > 2 the only subgraphs present in the graph are empty nodes and separated edges. 
When z decreases from 2 to 1, trees of higher and higher size appear in the graph. Finally for z < 1 trees 
of all sizes are present as well as cycles, because for cycles the critical z c is also 1. However, the number 
of cycles of a given length is always constant for z = 1, regardless of the size N. Thus binomial and 
ER random graphs are locally tree-like if p ~ 1/N. Because the clustering coefficient C is proportional 
to the number of triangles (n = I = 3) and inversely proportional to the number of connected triples 
(n = 3,Z = 2), one sees from Eq. (|2.3p that C ~ 1/N and that it vanishes for sparse networks in the 
thermodynamic limit. This is the first property of random graphs that disagrees with empirical results 
for real networks, for which, like we saw in Sec. 1.3, C is always much greater than zero. 

2) Giant component. For the most interesting case of p ~ 1/N, there is a critical value of k c = 1, 
above which a finite fraction of all nodes forms a connected component, called giant component. For 
k = 1 it has approximately TV 2 / 3 nodes but it grows quickly with k so that for k of order 5 and large 
N, more than 99% nodes belong to the giant component. All other clusters are relatively small, and 
most of them are trees. Thus when k passes the threshold k c — 1, the structure of graph changes from 
a collection of small clusters being trees of size ~ In N, to a single large cluster of size ~ N containing 
loops (cycles), and the remaining components being small trees. This behavior is characteristic not only 
for ER or binomial graphs, but it is a general feature of random graphs with various degree distributions 

ED- 

3) Degree distribution. For binomial graphs it is extremely simple to obtain the formula for degree 
distribution n(fe), just by observing that a node of degree k has k neighbors chosen out of N — 1 other 
nodes, and each of them is linked to the node with probability p: 

n(k) = ( N ; 1 ) P k (i-P) N - 1 -\ (2-4) 



k 

which for large N becomes a Poissonian distribution: 

n(fc)«e-*^. (2.5) 

The same function describes the node-degree distribution for ER graphs in the limit of N — > oo. For 
large k the degree distribution has a peak at k w k. Its width grows as Vk, so random graphs with high 
average degree are almost homogeneous in the sense of nodes degrees which assume values very close to 
k. This is a second feature that disagrees with real networks, where H(k) has often a heavy tail meaning 
that there are some nodes with high degrees far from the mean value, called "hubs". 

4) Diameter. As we mentioned in Sec. 1.2 we will calculate the diameter defined as the average 
distance / between pairs of nodes in the network rather than the maximal distance d. We expect that / 
is roughly proportional to d. Certainly it is a good measure of the linear extension of the network. For 
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p=0.1 



p=0.3 



p=0.8 



Figure 2.1: Binomial graphs for N = 10 and various p. For p = 0.1 the graph consists of separated trees. 
For p = 0.3 for which k = 2.7 we are above the percolation threshold k c — 1 and the giant component 
emerges (fat lines in the middle picture). In the limit of p — ► 1 the graph becomes dense. 



both definitions it has been found that above the threshold k c = 1, when a giant component is formed, 
the diameter grows only logarithmically with the size of the graph: 

r InN , , 

Zoc_. (2.6) 

We know that this behavior is called a small- world effect, and is almost always present in real- world 
networks. 



A next construction, which we briefly discuss, is the Watts-Strogatz model [27]. Its main feature 
is that it extrapolates between regular and random graphs. We start with N nodes located on a ring 
(see figure |2~2|) . Each node is connected to K of its nearest-neighbors, so all nodes have initially degree 
K. Then one rewires each edge with probability p to randomly chosen nodes, or leave it in place with 
probability 1 — p. Self- and multiple-connections are excluded. By tuning p one can extrapolate between 
if- regular graph (p — 0) and the maximally random ER graph (p = 1). This model originally arose 
from considerations of social networks, where people have mainly friends from local neighborhood, but 
sometimes they know someone living away - these cases are represented by rewired long-range edges. 

An important feature of this model is that the network can have a small diameter and large clustering 
coefficient at the same time. Let us consider first the limit p = 0. The network is regular and a ring-like. 
Therefore, the diameter d rcg ~ N grows linearly with N. The clustering coefficient C reg is constant and 
larger than zero when N — > oo because the nearest neighborhood of each node looks the same and there 
are always some triangle^. On the other hand, for p = 1 we have the ER random graph for which 
rfrand ~ In N and C ra nd ~ 1/N — + 0. Watts and Strogatz found [27] that there is a broad range of p, 
where d rs d Ian d and C rs C leg . This is the result of a rapid drop of the diameter d when p grows, 
while the clustering coefficient C changes very slow. The diameter decreases fast because even a small 
addition of short-cuts which takes place during the rewiring process, reduces significantly the average 
distance between any pair of nodes. These two properties, namely high clustering and small-world effect, 
agree with findings for many real networks. However, the degree distribution is similar to that of ER 
random graphs. There is no natural possibility to produce a power-law (or generally fat-tailed) degree 
distribution in this model. 

Now we shall describe the Molloy-Reed model [28, 29l[30j or configuration model, which allows for 
construction of pseudographs as well as simple graphs. In the Molloy-Reed model, to build a graph with 
N nodes one generates a sequence of non-negative integers {fci, fe, . . . , fcjv}, almost always as independent 
identically distributed numbers from a desired distribution H(k) and interprets fcj's as node degrees. The 
only requirement is that the sum k\ + k% + ■ ■ ■ + few = 2L is even. In the first step, each integer fcj 
represents a hub consisting of node i and k% outgoing "half-edges". In the second step these "half-edges" 
are paired randomly to form undirected links which now connect nodes (see figure l2~3j) . The number of 
links fluctuates around N(k}/2 if no additional constraint is imposed. In general, this procedure leads to 
pseudographs since sometimes an edge can be created between already connected nodes. To restrict to 
simple graphs one has to stop the procedure every time when a multiple or self-connecting link is created, 
and to start it from the beginning. This can be very time-consuming, especially for degree distributions 
with heavy-tails, where it is unlikely to produce only a single link between nodes of high degree. Thus 
sometimes for practical purposes one does not discard the whole network but only the last move, and 

2 We consider only the case of K > 2 when the network is connected. 
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Figure 2.2: Example of the construction of Watts-Strogatz small-world network. Starting from N = 10 
nodes, each with degree 4, one rewires some edges with probability p. As p increases, the graph becomes 
more random. 



Figure 2.3: Example of the Molloy-Reed construction of pseudographs. We start from N = 5 empty 
nodes with fcj "half-edges" (left-hand side) connected to node i. The numbers ki are taken independently 
from some distribution fl(fc). On the right-hand side we show two possible configurations obtained by 
pairing half-edges. 

chooses another pair of half-edges. This introduces correlations to the network and an uncontrolled bias 
to the sampling. In other words, graphs are not sampled uniformly [31] . 

Using this model Molloy and Reed have shown that for networks with uncorrelated degrees the giant 
component emerges when the following condition is fulfilled: 



For II(fc) being Poissonian one gets the well-known result for ER graphs: k c — 1. The most important 
property of the model is that it allows for power-law degree distribution. Indeed, up to finite-size effects, 
the distribution II(fc) is reproduced correctly. The average path length I has also been calculated [3]; it 
grows as In N with the system size, so again one has a small- world behavior. The clustering coefficient is 
proportional to k/N so it vanishes as for random graphs, but the proportionality coefficient depends on 
II(fc) and may be quite large for heavy-tailed distributions. 

Finally, we shall mention the Maslov-Sneppen algorithm [32] used for obtaining a randomized 
version of any network. The original motivation was to examine whether the appearance of degree- 
degree correlations and other non-trivial properties observed in some biological networks could be entirely 
attributed to the power-law degree distribution. The basic step in this algorithm involves rewiring of 
two edges. One selects two edges: i — > j and k — ► I, and then one rewires their endpoints to get i — ► I 
and k — > j. If this move leads to multiple- or self-connections, one rejects it and tries with another pair 
of edges. To obtain a randomized ("thermalized") version of the given network, one repeats this move 
many times. The algorithm preserves degrees of all nodes, so at the end of randomization the degree 
distribution is the same as for the original network. However, thermalization breaks any correlations 
between nodes which might be present at the beginning. In a sense, one obtains a new network being 
maximally random for the given sequence of degrees {fci, . . . , fcjv}. In next sections we will see that this 
algorithm is also very helpful for generating graphs in a micro-canonical ensemble. 

The four models presented above clearly belong to the class of equilibrated networks because every 
node on the network has statistically the same properties. Nodes have no individual attributes which 
would be correlated with nodes' labels, as one can see if one repeats the process of generation of networks 
many times. In the next subsection we shall explain in a more detailed way what it means and how to 
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define an ensemble of equilibrated graphs. We shall see that graphs from the ensemble can be generated 
in a process of thermalization which homogenizes the network. 

2.1.2 Canonical ensemble for ER random graphs 

The basic concept in the statistical formulation is that of statistical ensemble. The statistical ensemble of 
networks is defined by ascribing a statistical weight to every graph in the given set. Physical quantities 
are measured as weighted averages over all graphs in the ensemble. The probability of occurrence of 
a graph during random sampling is proportional to its statistical weight, thus the choice of statistical 
weights affects the probability of occurrence and, in effect, also "typical" properties of random graphs in 
this ensemble. For convenience, the statistical weight can be split into two components: a fundamental 
weight and afunctional weight. If the functional weight is independent of the graph, graphs are maximally 
random. The fundamental weight tells one how to probe the set of "pure" graphs uniformly, so that each 
graph in the ensemble is equiprobable. In other words, the fundamental weight defines an uniform measure 
on the given set of graphs and should be fixed. The functional weight is the parameter of the model. 

What is the most natural candidate for the fundamental weight for graphs? Consider simple graphs 
with a fixed number of nodes. We can choose the uniform measure by saying that in this case all unlabeled 
graphs are equiprobable, or alternatively that all labeled graphs are equiprobable. These two definitions 
give two different probability measures since the number of ways in which one can label graph's nodes 
depends on graph's topology. It turns out that the latter definition is in many respects better and we will 
stick to it. For instance, with this definition ER graphs have a uniform measure and thus are maximally 
random. There are also some practical reasons. First, in the real world as well as in computer simulations 
node are labeled^]. Second, it is not easy to determine whether two unlabeled graphs are identical or not. 
The problem of graph isomorphism has certainly NP-complexity but it is unknown if it is NP-complete 

For pseudographs, the fundamental weight is most naturally defined by saying that fully labeled 
graphs, that is having nodes and edges' endpoints labeled, are equiprobable in the maximally random 
case. One can show that for this choice each unlabeled graph has the weight equal to the symmetry 
factor of Feynman diagrams generated in the Gaussian perturbation field theory f20j [34] . 

Let us concentrate on simple graphs. Consider again an ensemble of Erdos-Renyi's graphs with N 
labeled nodes and an arbitrary number L of (unlabeled) links. Since each ER graph from this ensemble 
can be in a one-to-one way represented as a symmetric N x N adjacency matrix we see that the uniform 
measure in this ensemble is alternatively defined by saying that all such matrices are equiprobable. What 
about unlabeled graphs? Are they equiprobable in this ensemble? An unlabeled graph is obtained from 
a labeled one by removing labels. We immediately see that each unlabeled graph can be obtained from 
many different labeled graphs. Let us consider the unlabeled one shown on the left-hand side in the 
upper part of Fig. 12.41 Since there are three nodes one can naively think that there are 3! labeled graphs 
corresponding to this shape as shown on the left-hand side of the figure. Actually, it turns out that there 
are only three distinct ones in the sense of having distinct adjacency matrix. Graphs A, C, E are distinct, 
but B is identical to A, D to C, and F to E: 

/ 1 1 \ / 1 \ / 1 \ 

A A = A B = 1 , A c = A D = 1 1 , A E = A F = 1 . (2.8) 
\ 1 / \ 1 / \ 1 1 J 

In other words there are three labeled graphs having this shape. On the other hand, if one takes the shape 
in the lower line of Fig. 12.41 one can see that there is only one labeled graph corresponding to it, since 
all others have the same adjacency matrix. In view of this we see that the probability of occurrence of 
the upper shape is three times larger than of the lower one since the upper is realized by three adjacency 
matrices while the lower has only one realization. 

Let us consider now an ensemble of Erdos-Renyi graphs with N — 4, L = 3. The set consists of only 
three distinct unlabeled graphs A, B, C shown in Fig. 12.51 Each graph has a few possible realizations 
as a labeled graph. One can label four vertices of A in 4! = 24 ways corresponding to permutations 
of nodes 1 — 2 — 3 — 4, but only ua — 12 of them give distinct labeled graphs. It is so because every 
permutation has its symmetric counterpart which gives exactly the same labeled graph, e.g. 1 — 2 — 3 — 4 
and 4 — 3 — 2 — 1. Similarly, one can find that there are tib = 4 labeled graphs for B and nc = 4 for C. 
One can check that indeed by dropping three links at random on four nodes one gets these numbers of 
labeled ER shapes. Altogether, there are ua + tib + nc — 20 labeled graphs in the given set. Because 

3 In real-world networks one can always distinguish nodes for example by names of Web pages, people, scientific papers 
etc. On a computer, nodes are obviously labeled by their representation in computer memory. 



12 



Figure 2.4: Top: the given unlabeled graph can be realized as three different labeled graphs. A is 
equivalent to B, C to D and E to F since they have the same adjacency matrix and hence are identical. 
Bottom: a triangle-shaped graph has only one realization as a labeled graph. 

0—0 
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Figure 2.5: Three possible graphs for N = 4, L = 3, for each of them one of possible labellings is shown. 
The total number of different labellings of these graphs is: ha — 12, tib — 4, nc = 4. 



all labeled graphs are assumed to be equiprobable, the shapes A, B, C have the following probabilities of 
occurrence during the random sampling: 

n,A 3 n B 1 n c 1 ,„ 

PA = — = -, Pb = — = -, Pc = — = -• 2.9) 
n 5 n 5 no 

We see that (unlabeled) ER graphs are not equiprobable - the distribution is uniform only for labeled 
graphs. Let us denote the statistical weights for A, B, C by wa,wb,wc- They are proportional to 
probabilities of configurations and hence wa '■ Wb '■ wc — PA ■ Pb ■ Pc- There is a common proportionality 
constant in the weights, which we for convenience choose so that the weight of each labeled graph is 1/N\. 
For this choice we have wa — 1/2, wb — 1/6, wc — 1/6. The larger is the symmetry of a graph 
topology, the smaller is the number of underlying labeled graphs and thus the smaller is the statistical 
weight. The choice l/N\ compensates the trivial factor of permutations of indices, and thus removes 
overcounting - however, for graphs with fixed number of nodes this particular choice does not influence 
on any physical properties. 

We now apply the above ideas to define an ensemble of ER graphs with arbitrary N, L. The partition 
function Z(N,L) for the Erdos-Renyi ensemble can be written in the form: 

Z(N,L)= E ^1 = E ( 2 - 10 ) 

a'elg(N.L) ' aeg(N ,L) 

where lg(N,L) is the set of all labeled graphs with given N, L and g(N,L) is the corresponding set of 
(unlabeled) graphs. The weight w(a) = n(a)/N\, where n(a) is the number of labeled versions of graph 
a. We are interested in physical quantities averaged over the ensemble. The word "physical" means here 
that the quantity depends only on graph's topology and not on how nodes' labels are assigned to it. It 
is a natural requirement. The average of a quantity O over the ensemble is defined as 

<°^zih) £ ° <a,) M -mri E w(a>OM - (2 - u) 

v ' a'£lg(N,L) y ' aeg(N,L) 

We shall refer to the ensemble with fixed TV, L as to a canonical ensemble. The word "canonical" is used 
here to emphasize that the number of links L is conserved like the total number of particles in a container 
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with ideal gas remaining in thermal balance with a source of heat. Although there is no temperature 
here, the analogy is close because, as we shall see later, these graphs can be indeed generated in a sort of 
thermalization process. 

The partition function Z{N,L) can be calculated by summing over all adjacency matrices A which 
are symmetric, have zeros on the diagonal and L unities above the diagonal |22j. The result is: 

1 



z ^ = w\l)' (2 - 12) 

which agrees with simple combinatorics: there are ways of choosing L links among all possible 

( 2 ) edges. In a similar manner, summing over adjacency matrices, one can calculate averages of various 
quantities. As an example let us consider the node degree distribution IL(k): 



N 

where one can use an integral representation of the discrete delta to get [22] 

*>=G v i)(VM ( ?> 

This is an exact result for ER random graphs. It reduces in the limit k — const, TV — > oo to the Poissonian 
distribution 112.511. 



2.1.3 Grand-canonical and micro-canonical ensemble of random graphs 

So far we have discussed the canonical ensemble of Erdos-Renyi graphs with N, L fixed. If we allow for 
fluctuations of the number of edges, we get the binomial model. The probability of obtaining a labeled 
graph with given L is P(L) = p L (l — p)( 2 ^~ L . Thus the partition function is 

L a£lg(N,L) ' L a£lg{N,L) 

The factor (1 is inessential for fixed N and can be skipped. The new partition function reads 

Z(N, n) = E exp(-nL) Z(N, L), (2.16) 

L 

where = exp(— (a) or equivalently \i — ln-^ 2 . The weight of a labeled graph a is now 10(a) = 
exp(— /iL(a))/N\, where [i is a constant which can be interpreted as a chemical potential for links in the 
grand-canonical ensemble l|2.16p . Notice that the function 1|2.16|1 can be regarded as the generating 
function for Z(N, L). One can calculate the average number of links or its variance as derivatives of the 
grand-canonical partition function with respect to fi: 

(L) - -d„\D.Z{N,ii), (2.17) 
(L 2 )-(L) 2 = %]n.Z{N,»). (2.18) 

Like for the canonical ensemble of ER graphs, the sum of states can be done exactly: 



N\\ L J Nl 

L=0 v 7 



(2.19) 



It is easy to see that for fixed chemical potential fi the average number of links behaves as 



, s N(N — 1 1 N(N- 1 

(L) = p—^-r = y — -. (2.20 

Thus for N — > 00 the graphs become dense; k increases to infinity. We know that this pathology can be 
cured by an appropriate scaling of the probability p: p ~ 1/N. Since /i = ln^p, this corresponds to 
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(j, <~ In TV. In this case L is proportional to N. The corresponding graphs become sparse and the mean 
node degree is now finite. The situation in which fj, scales as In N is very different from the situation known 
from classical statistical physics, where such quantities like chemical potential /i are intensive and do not 
depend on system size N in the thermodynamic limit N — > oo. Moreover, the entropy S = ItlZ(N, L) is 
not extensive - one can show that 

S=^NlnN+ 2 + k ~ klnk N + 0(\nN), (2.21) 

so the system is not "normal" in the thermodynamical sense for k ^ 2. Only when k = 2, that is if 
N = L, the entropy becomes extensive. This means that each graph from this set can be partitioned 
so that we get two sets of graphs A and B, with Na + Nb = N nodes, and the partition function for 
A+B being just the product of the partition functions for A and B. In other words, almost every graph 
in A+B can be constructed by taking two graphs: one from A and the second one from B, and joining 
two of their nodes by a link. In classical statistical physics this means that interactions between A and 
B take place only on the boundary which can be neglected in the thermodynamical limit. In the context 
of ER graphs, the case N — L must therefore corresponds to the set of tree-like graphs - the number of 
loops must be small and they must be short (local). 

As mentioned, the difference between canonical and grand-canonical ensembles gradually disappears 
in the large N limit. It is easy to see why. In a canonical ensemble of sparse graphs the average degree 
k = 2L/N is kept constant when N —> oo while in a grand-canonical it fluctuates around (k) — 2(L)/N = 
k, if fi is properly chosen. However, the magnitude of fluctuations around the average disappears in the 
large TV limit since 

^-(^©(TTFSF- (2 - 22) 

and for fj, <~ hiN the relative width \J (L 2 ) — (L) 2 / (L) <~ N^ 1 / 2 — > 0, so effectively the system selects 
graphs with (k) — k. 

Apart from the canonical and grand-canonical ensembles, one can define a micro-canonical en- 
semble of ER random graphs. By analogy with classical physics, we define it as a set of all equiprob- 
able graphs with prescribed sequence of degrees {fci, . . . , fc/v} which plays the role of the microstate. 
Then the canonical ensemble is constructed by summing over all sequences obeying the conservation law 
ki + ■ ■ ■ + fcjv = 2L. It looks similar to the construction of Molloy and Reed, and indeed, it is its special 
case. We shall make use of the micro-canonical ensemble in Chapter 3 in the context of dynamics on 
graphs. 



2.1.4 Weighted equilibrated graphs 

In the previous section we described ensembles for which all labeled graphs had the same statistical 
weight. They were just ER or binomial random graphs and thus had well known properties. In section 
2.1.1 we pointed out however, that most of these properties do not correspond to those observed for real 
world networks. But the framework of statistical ensembles is very general and flexible and it allows one 
to model a wide class of random graphs and complex networks with non-trivial properties. Consider the 
same set of graphs as in the Erdos-Renyi model but now to each graph in this set, in addition to its 
fundamental weight I/AH, we ascribe a functional weight W(a) which may differ from graph to graph 
so that graphs are no longer uniformly distributed. By tuning the functional weight one can make that 
typical graphs in the ensemble will be scale-free or have more loops, etc. One has a freedom in choosing 
the functional weight. The only restriction on W(a) is that it should not depend on the labeling because 
graphs need to remain equilibrated. We stress that we still have the same set of graphs but now they 
may have distinct statistical weights. 

The partition function for a weighted canonical ensemble can be written as 

Z(N,L)= ]T (l/N\)W(a') = w{a)W{a), (2.23) 

a'£lg(N,L) aeg{N,L) 

where as before w(a) = n(a)/N\ counts labeled graphs. For W(a) = 1 we recover the ensemble of ER 
graphs. The simplest non-trivial choice of W(a) is a family of product weights: 

N 

W(a) = ]J P (k i ), (2.24) 

i=l 
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where p(k) is a semi-positive function depending on degree h of node i. This functional weight is local 
in the sense that it depends only on individual degrees which are a local property of the graph. It 
does not introduce explicitly correlations between nodes, so we will call random graphs generated in 
this ensemble uncorrelated networks. One should, however, remember that the total weight does not 
entirely factorize because the fundamental weight w(a) = n(a)/N\ written as a function of node degrees 
w(ki, k 2 , ■ ■ ■ , k N ) does not factorize since the number n(a) of labelings is not a product of any local 
property of the graph but is a global feature. There is also another factor which prevents the model from 
a full factorization and independence of node degrees, namely the constraint on the total number of links 
2L = k\ + k 2 + ■ • ■ + fcw which for given L and N introduces correlations between fc^'s. For example, 
if one of fci's is large, say ~ 2L, then the remaining ones have to be small in order not to violate the 
constraint on the sum. The effect gradually disappears in the limit L — > oo for a wide class of weights 
p(k) since then the canonical ensemble and the grand-canonical ensemble, for which L does not need to 
be fixed, become equivalent |18j . 

The weight (|2.24p is especially well-suited for studying ensembles with various degree distributions and 
no higher-order correlations. To see how p{k) is related to n(fc), let us first discuss the analogous ensemble 
of weighted pseudographs. They can model networks where self- interactions of nodes are important, as for 
example ecological networks which describe predator-prey relations where cannibalism is often present. 
A pseudograph can be represented by a symmetric adjacency matrix A whose off diagonal entries Ac- 
count the number of links between nodes i and j, and the diagonal ones An count twice the number 
of self-connecting links attached to node i. Each adjacency matrix represents a certain labeled graph, 
but now, due to possibility of multiple links, we label also edges and call such a graph a fully labeled 
graph. To each fully labeled graph we ascribe a configurational weight 1/AH(2.L)!. The weight of each 
labeled graph (where only nodes are labeled) having adjacency matrix A is then 

■m (n 2 >W2 (A M /2)i ) l a-tI Ix^I Jxj (2 - 25) 

where the origin of all symmetry factors is the same as in case of Feynman diagrams and stems from 
possible ways of labeling links (see e.g. [22j). The key points behind introducing pseudographs are: i) 
the set contains the subset of all simple graphs which we are interested in, and, ii) despite a complicated 
form of Eq. (|2 .25(1 . the canonical partition function can be easily evaluated. Let us rewrite the formula 
(|2.23j) for Z(N,L) for pseudographs with functional weight H2.24|) : 

mE^«-"IIp(*> e e n ^'^t^^'^ n^- (^6) 

<? 1 A .. = A .. = * ' *>i v 

0,2,4,... 0,1,2,... 
i—l..N i>j 

Using the standard integral representation of the delta function we can rewrite all sums over Aij as 

i Au=0,2,A,... Ai j= 0,l,2,...,i>j i i>j J 

The sum over diagonal elements gives a product of factors e z ^ 2 . The sum over Aij is also easy to 
calculate and reads Yii>j eZiZj ■ Putting the two results together we find the following factor: e^».j ZiZ ^ 2 . 
Therefore, the partition function is 

Z( ~ N > L ) - M E S Z< / II ^ftK" 1 "*^* ■ (2-28) 

q i 

The last, quadratic term can be expanded by means of the Hubbard-Stratonovich identity: 

exp (4") = vb/ da:exp ( _ T _AE ) • (2 ' 29) 



The discrete delta giving conservation of links can be written as a contour integral, so we get 

~i N 

/ \ fy\ q eXZ 

9 



1 I d V ..-1-2L [ dx „-x 2 /2 



(2.30) 
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The integral over dz yields (xy) q /ql. Changing variables: y — > v = xy and changing the order of 
integration over da; and dv we immediately obtain 



N 



NIJ ^ 

where we have defined the following generating function for weights p(q): 



N\ J 2m 

(2.31) 



Up to now, these results are strict. However, the integral over dv is often hard to calculate for finite 
N,L. Fortunately, the partition function l|2.31jl can be calculated in the thermodynamical limit. The 
saddle-point integration yields: 

(2L - IV 

In Z(N, L) » JVln F(v ) - (2L + 1) In v + In V + . . . , (2.33) 

with vq being a solution to the equation: 

-fgH- (2 - 34) 

We are now ready to calculate II(fc). Since all nodes are equivalent in the equilibrated network with 
product weights (|2.24|) , the degree distribution can be obtained by a simple differentiation of the partition 
function: 

U(k) P{k) dZ{N ' L) P ( !: ) l dlnZ{N > L) (2 35) 

11(fc)_ NZ(N,L) dp(k) ~ P(k) N dp(k) ' (2 - 3d) 

and by applying Eq. (|2.33[) we finally arrive at 

n « - mk- < 2 - 36 > 

This result has been derived in the thermodynamical limit for the canonical ensemble of pseudographs. 
If we try to do the same for simple graphs, the calculation of the partition function is more complicated, 
because if we exclude multiple and self-connections, the weight of each labeled graph is identical, and the 
entries Ay of the adjacency matrix assume now only two possible values and 1. This leads to a change 

of the factor e^( Ei Zi ) in Eq. (gHH to 

Y[(l + ZiZj) =e^^ ln ( 1+ZiZ i\ (2.37) 

i>j 

The integrals over dzi cannot be done in a straightforward way. One can, however, use the following 
expansion: 

n—1 i n—1 \ i / 

and, in order to get the factorization of Zj's, to apply the H-S identity (|2.29p to each quadratic term in 
the second sum over n. This leads to the following, rather formal, integral: 



2m' J 2m 



<?!,■■■,<?« \ i n=l 



If we look at Eq. (f238j) as a perturbative expansion, the integral over dx n gives a "product" correction of 
nth order to Z(N, L). Taking only first few terms in n we get an approximation of Z(N, L), but because 
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we know that Z(N, L) is finite, it is not necessary to take all of them. If we restrict ourselves only to the 
first order n = 1 we get 



A! I 2m 



1 



ckcie 



-xf/2 



L q J 



-z 2 /2+x ± z 



-i N 



(2.40) 



This is indeed a partition function for pseudographs but with single self-connections excluded. Multiple 
connections and double, triple, etc. self-connections are still present. Changing variables y — > v = x±y 
and evaluating the integral over dz we have 



mjsf f£r^ (-£) 



m=0 



m\(q — 2m)! 



N 



(2.41) 



and because the integral over dxi is dominated by the region x\ ~ \fL oc \/A, only the first term in the 
sum over m contributes in the limit of A — > oo. We end up with a partition function like in Eq. (|2.3ip for 
pseudographs. As a by-product we can also estimate the characteristic value of z ~ (q + ~ l/VN 

in the integral over dz in Eq. (|2 .40(1 . Let us consider now the product of integrals in Eq. (|2,39j) and 
try to estimate the characteristic values of x±, . . . , Xoo and Zi,...,zn in order to convince ourselves that 
integrals over da^da^, . . . can be neglected in the thermodynamical limit. Assuming that in the limit 
A — > oo the integral is dominated by a single saddle point, we must find the maximum of the function: 



E 



n (-iy 



(-1)' 



Z- -\- x n z ■ 



^(1 + <7i) ln«f. 



2 " ^ 2n 

n— l \ i / i 

The differentiation with respect to z% and x n gives the following set of equations: 

Vn=l,...,oo: (-l) n nx„ + ^z 4 " = 0, 



V^ = 1, . . . , N : + nXnZ n = q . + h 



(2.42) 



(2.43) 



(2.44) 



The integrals over dzt as well as the sums over qi factorize, thus we can skip indices i because characteristic 
values of all Zi's and all q^s are equal. This allows for solving these equations. We have 



1=1 ~ 7W 



N l-n/2 



(2.45) 
(2.46) 



so ~ vA but x n 's for higher n tend to zero in the thermodynamic limit. This means that the 
only significant contribution to Eq. (|2.39|) is from the integral over d£i. Therefore, Eq. (|2.40p is a good 
approximation. We notice that in the limit A — > oo this equation is identical to Eq. (|2.30p which we 
had before for pseudographs. Thus the degree distribution II(fc) is again given by Eq. (|2.36p . Let us now 
discuss some consequences of that formula. First, for p(q) = 1 the generating function F(v) = e v and 
n(fc) is Poissonian as it should be for equally weighted ER graphs. Second, to get any desired degree 
distribution Il(fc) one should take p(q) = qlli(q) and tune the average degree k so that vq = 1: 



k = k c = F'(l)/F(l). 



(2.47) 



In other words, the number of links and nodes must be carefully balanced to obtain a desired distribution 
H(k): 2L/N = k = J2k kH(k) in the limit of large graphs. For instance, to get a power-law distribution 
one should take p(q) ~ q } -q~ J and adjust A, L carefully. A very important example is the distribution for 
Barabasi- Albert model [l]: 

for k > and 11(0) = 0, which will be discussed in next section. In order to obtain the ensemble 



with II(fe) given by the above formula, one has to choose p(k) = kl- t 



for k = 1,2, 



and 



• fc(fc+l)(fe+2) 

p(0) = 0. The mean of the distribution l|2.48|) is k c — 2 so we have to take A = L to adjust k to this 
value. If L is too small, the degree distribution falls off exponentially for large degrees as one can see 
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from Eq. (|2 .36(1 . because then the saddle point vq < 1. When one exceeds the critical degree k c , the 
saddle-point approximation is no longer valicfl and the behavior depends on whether we consider simple- 
or pseudographs. For simple graphs, the degree distribution has no longer a power-law tail, but has a 
more complicated form. We must remember that for simple graphs Eq. (|2,36p is only an approximation. 
A very interesting behavior is observed for pseudographs. It has been shown [35] that a surplus of links 
condenses on a single node, thus n(/c) has the same power-law distribution as for the critical degree k c , 
but with an additional delta peak whose position moves linearly with the system size N. This is the same 
phenomenon as in the "Backgammon condensation" taking place in the balls- in-boxes model [36J. We 
shall devote one section of Chapter 3 to this problem, so now we will only mention that this is related to 
the divergence of the series (|2.32|) when 2L/N exceeds the threshold k c . In fact, we shall see in Chapter 
3 that the partition function for the balls-in-boxes model is given by the same formula as Eq. (|2.3ip for 
pseudographs and therefore the model can be mapped onto the balls-in-boxes model. 

There is also another problem which should be mentioned here. Equation (|2.36[) is valid only for 
infinite sparse graphs, that is for N — > oo and k fixed. For finite N, the node degree distribution II (fc) 
deviates from the limiting shape due to finite-size corrections, which are particularly strong for fat-tailed 
distributions II(fc) ~ fc -7 . As a result of structural constraints, the maximal node degree cannot be ~ N 
but often it scales as some power of N smaller than one. Corrections to the scale-free degree distribution 
for finite networks will be extensively discussed in section 3.1. 

Let us mention also a particularly important subset of weighted graphs, namely weighted trees [37] . 
Because of their special structure (no cycles), many results can be obtained analytically. For instance, 
for trees with product weights, similarly as for pseudographs one can calculate the expression for II(fc): 

(k - iy.F(v ) 
where the generating function F(v) is now given by 

oo q-l 

F(«)=5>(g),— — . (2.50) 

q=l ^ '' 

Therefore to get a power-law degree distribution one has to take p{k) ~ (k — l)!fc~ 7 . Similarly, one can 
calculate correlations [38] : 

eCM^^^"^ (2-51) 
and hence the assortativity coefficient from Eq. (|1.6p . which for trees with BA degree distribution reads 

showing that this network is disassortative. Trees will be more throughly discussed in section 2.3 in the 
context of comparing the properties of equilibrated and causal networks. 

At the end we shall mention that one can define more complicated weights than those given by 
Eq. (|2.24p . A natural candidate for a weight to generate degree-degree correlations on the network is the 
following choice (39l HO] : 

L 

W(a) = l[p(k ai ,k bl ), (2.53) 
i=i 

where the product runs over all edges of graph, and the weight p(k a , kb) is a symmetric function of degrees 
of nodes a, b at the endpoints of link. One can choose this function to favor assortative or disassortative 
behavior |Mt|4Ql[4ll[42l|43]. Similarly, one can tune the weights to mimic some other functional properties 
of real networks, like for example higher clustering |44 [ 145 } [46 | [47 ] [48] . 



2.1.5 Monte Carlo generator of equilibrated networks 

Only for a few models of random graphs, closely related to ER graphs, one can calculate almost all 
quantities of interest analytically. This is not the general case for weighted networks like those presented 
in the previous section. In some cases it is useful to support the discussion with computer simulations. 
Various methods have been proposed for generating random graphs, but usually each of them works only 

4 See the discussion of the condensation in balls-in-boxes model in Sec. 3.2.2. 
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for one particular model or its variations. In this section we will describe a very general Monte Carlo 
method which allows one to study a wide class of random weighted graphs. The idea standing behind 
this method is to sample the configuration space with probabilities given by their statistical weights. 
Unfortunately, there is no general and efficient procedure that picks up an element from a large set with 
the given probability. The most naive algorithm in which one picks up an element uniformly and then 
accepts it with the probability proportional to its statistical weight has a very low acceptance rate when 
the size of the set is large. Because the number of graphs grows exponentially or fasteo, one clearly sees 
that another idea must be applied. In this section we will discuss such an idea which is derived from a 
general framework of dynamical Monte Carlo techniques. 

The idea is to use a random walk process, which explores the set of graphs, visiting different configu- 
rations with probability proportional to their statistical weights. Such a process is realized as a Markov 
chain (process) which has a unique stationary state with the probability distribution proportional to 
W{a). The Markov chain is defined by specifying transition probabilities P(a — > (3) to go in one ele- 
mentary step from a configuration a to (3. The elementary step is a kind of transformation which carries 
over the current graph into another one. A convenient way to store these probabilities is to introduce 
a matrix P, called a Markovian matrix, with entries P a p = P(a — > (3). For a stationary process, the 
transition matrix P is constant during the random walk. The process is initiated from a certain graph 
an and then elementary steps are repeated producing a sequence (chain) of graphs ao — > a± — > «2 — * • • • • 
The probability pp(t + 1) that a graph (3 is generated in the (t + l)th step is given by: 

P/3(*+l)=$>a(t)Pa/3, (2.54) 

a 

which can be rewritten as a matrix equation: 

p(t + l) -P T PW, (2.55) 

where P r denotes the transpose of P and p is a vector of elements p a . From general theory of Markovian 
matrices [49] we know that the stationary state, characterized by the equation: p(t + 1) = p(t), corre- 
sponds to the left eigenvector of P to the eigenvalue A = 1. If the process is ergodic, which means that 
any configuration can be reached by a sequence of elementary steps starting from any initial graph, and 
if the transition matrix fulfills detailed balance condition: 

Va,(3: W(a)P af3 = W(p)Pp a , (2.56) 

then the stationary state approaches the desired distribution: p a (t) — > W(a)/Z for t — > oo. In other 
words, in the limit of infinite Markov chain, the probability of occurrence of graphs becomes proportional 
to their statistical weights and is independent of the initial graph. However, one must be careful while 
generating relatively short chains. First of all, the probabilities can strongly depend on the initial state, 
and one has to wait some time before one starts measurements, to "thermalize" the system, i.e. to reach 
"typical" graphs in the ensemble. Second of all, consecutive graphs in the Markov chain may be correlated, 
especially when the elementary step is only a local update. Therefore one has to find a minimal number 
of steps for which one can treat measurements on such graphs as independent. 

Among many possible choices for probabilities P a p, which lead to the same stationary distribution, 
we shall use here the well-known Metropolis algorithm [50J, based on the following transition probability: 

The algorithm works as follows. For the current configuration a one proposes to change it to a new 
configuration (3 which differs slightly from a and then one accepts it with the Metropolis probability 
(|2,57|) , Repeating this many times one produces a chain of configurations. The proposed elementary 
modifications (steps) should not be too large because then one risks that the acceptance rate would be 
small. Therefore, all algorithms which we propose below attempt in a single step to introduce only a 
small change to the current graph, by rewiring only one or two links. 

Let us try to apply these ideas to write a Monte Carlo algorithm for generating weighted graphs from 
the canonical ensemble. A good candidate for elementary transformation of a graph is rewiring of a link 
called "T-move" (see Fig. 12-6(1 . because it does not change N and L, fixed in the canonical ensemble. 
We choose a link ij and a vertex n at random, and rewire one of the endpoints of the link, say j, to n, 
forming a new link in which replaces the old one ij. Sometimes it is easier to think about a rewiring of 
an oriented link i — > j to i — > n, and simultaneously, j — > i to n — > i. If the link in is already present, 
or if n = i we reject this move to prevent from forming a multiple- or self-connection. Then, each move 

6 For instance, for ER model it grows faster than exponentially which results in a non-extensive entropy of graphs, see 
Sec. 2.1.3; for introduction on counting graphs see also the reference |51j . 
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Figure 2.6: The idea of "T-move": a random link (solid line) is rewired from vertex j to a random vertex 
n (left-hand side). Alternatively (right-hand side) a random, oriented link (dotted line) is rewired from 
vertex of its end j to a random vertex n. The opposite link j — > z is simultaneously rewired. 

is accepted with the Metropolis probability I|2.57J1 . In a special case of the ensemble of Erdos-Renyi 
graphs this probability is equal to one since every graph has the same weight. So ER graphs can be 
alternatively obtained by simple rewirings of any other graph - we call this process "thermalization". 
The thermalization (homogenization) can be used to generate any ensemble of equilibrated networks. In 
fact, one could define equilibrated networks as graphs, which can be produced in a procedure like this, if 
weights do not depend on nodes' labels. This process destroys any correlations which might be present 
due to causal growth of the initial network. 

One can show [22l that, indeed, this algorithm produces labeled graphs with desired probabilities. 
We skip here the details. We would like, however, to point out two difficulties which can be encountered. 
First, it is not clear whether the ergodicity is not broken in the limit N — > oo for models where the number 
of configurations grows with N faster than exponentially. Second, for some classes of (unphysical) weight 
functions, a local algorithm may not be ergodic. Consider for example weight functions W(a) of the form 
(|2.24|) . with p(k) being a function which is strictly positive on a support which has a gap in the middle 
- an interval k € (fcijfe), where p(k) — 0. In other words, there are no configurations in this ensemble 
which have a node with degree k 6 (ki,k 2 ). A single rewiring can change degrees only by ±1, so it is not 
possible to change the value of k from k < ki to k > k% since it would have to go through the forbidden 
region (fci, fo). In this case, in order to avoid the difficulty one would have to invent an algorithm which 
is able to significantly change k in a single move, to jump from one to another part of the support of the 
weight function p(k). We shall not, however, consider such unphysical weights p(k). For weights, which 
are physically important, the support of the weight function is connected. In this case the acceptance 
probability reads 

p, m • L Wtf)\ . f p(kj-l) P (k n + l) \ . f W(k n ) \ 

P a (a - /J) = mm jl, _ j = nunjl, j = „un|l, —-^ j , (2.58) 

where we have introduced an auxiliary function: 

The deg taken from the current graph a. In the computer algorithm we prefer to use the 

weight function w(k) instead of p(k) to reduce computational cost and round-off errors. In fact, w(k) can 
be exactly calculated for many important p(k)'s, which we are interested in. For example, to get the BA 
degree distribution in simple graphs, according to Eq. <|2.48[1 . we have to choose p(k) = MUsAik) and 
hence 

k(k + l) 

w ^ = fc + 3 ( 2 - 6 °) 
while for trees, because of the factor (k — 1)! in Eq. (|2.49p . 

W (k) = (2.61) 

The rewiring procedure described above does not change N and L. If we want to simulate weighted 
graphs from the grand-canonical ensemble, we have to choose another transformations which change 
the number of links L. Natural candidates for such transformations are two reciprocal transformations: 
adding and deleting a link. In order to add a link we have to choose two vertices to which the addition 
is attempted. To remove a link we pick up one link out of all L existing in the graph. These two 
transformations must be carefully balanced in order to get graphs with correct probabilities. If the 
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Figure 2.7: The idea of "X-move": two oriented links (dotted lines) ij and In chosen in a random way are 
rewired, exchanging their endpoints. Then the opposite links (solid lines) are also rewired. 



frequency of the two transformations is the same, then the acceptance probabilities for each of them are 
given by [22J 

f N 2 W(B) 1 

P add( . - /9) = min{l,exp(-,) ^—^ ^gj , (2.62) 

and 

P dc i(/3 -kx)= min (l, exp(+/i) ^ tttt^tI > 



iV 2 J 



(2.63) 



respectively. Here fi is the chemical potential for links, defined in Eq. I|2.16p and chosen to obtain a 
desired average number of link^H (L). As before, if we want to produce only simple graphs we must 
eliminate moves leading to self- or multiple connections. One could modify the algorithm in many ways. 
For example, one could, instead of picking up a link as a candidate for removing, pick up two nodes 
at random and if there is a link between them, remove it. Then the fractions N 2 /2(L + 1) and 2L/N 2 
would disappear from equations (|2.62|) and (|2 .63(1 - This would not change the probabilities of graph's 
occurrence, but it would affect the acceptance rate. For sparse networks such a modified algorithm is 
worse than the previous one because the chance that there is a link between two randomly chosen nodes 
is very small and for most of the time the algorithm would do nothing except looking for links that can 
be removed. On the other hand, the acceptance rate for the original algorithm is finite for N — > oo since 
then fj, behaves as In Af and the factor e M 2L/A 2 is of order 1. 

Let us consider now a version of this algorithm suitable for the product weights (|2 . 24[) . The probability 
of acceptance of a new configuration by adding or removing a link between ij reads: 



N 2 
2{L + 1) 



min -j 1, nt - ; - - e ^w{ki)w(kj)\ for addition a link, 



""{^ ^ft-lHVl) } for deleting a link, 

where L and ki,kj refer to the current configuration and w(k) is given by Eq. (|2 .5Q|) . 

Finally, let us say some words about the generation of graphs from the micro-canonical ensemble. 
Inspired by the Maslov-Sneppen algorithm preserving node degrees, as a local update we choose simulta- 
neous rewirings shown in Fig. 12.71 We shall call this combination "X-move". At each step of this algorithm 
one picks up two random links: ij and In, and rewires them to in and Ij. In case of the Maslov-Sneppen 
algorithm [32j, the functional weight is W a — 1 and hence all rewirings are accepted. In the general 
case, one can use this algorithm to generate graphs whose statistical weights depend for instance on the 
number of triangles, to get a high clustering coefficient, or to produce some higher-order correlations 
between nodes [52J. The motivation is similar to that given by Maslov and Sneppen, namely if one tries 
to determine relations between the abundance of structural motifs and the functionality of network, it is 
very important to construct randomized networks which could serve as a benchmark. 

We described the algorithms presented here and their implementation in |53j. Because we often have 
to do with sparse graphs, it is not needed to keep the whole adjacency matrix in computer memory. The 
data structure that we developed allows us to encode and simulate networks of size of order 10 6 . 



2.2 Causal (growing) complex networks 

In the previous section we discussed equilibrated networks, which can be constructed in a sort of thermal- 
ization or homogenization process. Clearly for such graphs, if they are labeled, a permutation of nodes' 

6 For complicated weights, when analytical calculations of the correspondence (L) <-> fj, is impossible, one can tune /i 
during the simulation to obtain desired number of links. 
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labels leads to the same set of graphs. In this section we shall discuss another kind of graphs, generated 
in a process of growth. A common feature of these graphs is that there is a natural labeling of nodes 
which corresponds to the order in which they were added to the graph. We call this labeling "causal", 
since it is always obvious which node is an ancestor of which. The corresponding graphs will be called 
causal networks. The causality introduces a restriction on the number of ways in which the graph can 
be labeled. As we shall see, this restriction very strongly affects properties of typical networks in the 
ensemble. 

In this section we shall discuss some famous models of growing networks. These models are more 
popular than equilibrated networks presented above and, indeed, they were first models reproducing 
many properties of real network. Although in some models the rules governing the process of growth 
may look somewhat artificial, it is instructive to study how methods of statistical physics can be applied 
to causal networks. Because of the growing nature of these networks, the rate equation approach is 
particularly well suited to study them. We will see, however, that also the concept of statistical ensemble 
can be very helpful in order to understand some features of these networks. 



2.2.1 Models of growing networks 

As a first example of growing network we shall discuss the Barabasi- Albert (BA) model introduced 
in a very seminal paper [15]. This model triggered enormous activity in the field of complex networks. 
Similar models were proposed in different contexts and discussed several times in the past (for review see 
e.g. [54]). The model has two basic ingredients: growth and preferential attachment. The latter means 
that new nodes added to the system prefer to attach to nodes with higher degrees. In effect, high degrees 
are further increased and become even higheiQ. The model is defined as follows. Starting from a complete 
graph with no nodes, at each step a new node is introduced and joined to m previously existing nodes 
with the probability proportional to the degree of the node to which a new link is established. One can 
easily program this procedure on a computer, adding nodes one by one and attaching them according to 
the preferential attachment rule. There is also a slightly different version of the algorithm, a more tricky 
one, which instead of focusing on the nodes uses links as elementary objects. It is more effective, so let us 
shortly describe it. Each link ij is viewed in this algorithm as a couple of directed links i — * j and j — ► i. 
In the algorithm one picks up at random a directed link and chooses the node which is at the endpoint 
of this link as a node to which a new link is going to be attached. The preferential attachment rule is in 
this way simply realized, because the number of links pointing onto a node with degree k is equal to k. 
After t steps of nodes' addition, the network consists of no + t nodes and mt + no(no + l)/2 edges. For 
m = 1, the graph generated by this procedure consists of trees planted on the initial graph. If the initial 
graph is a tree, so is the whole graph. 

Later on we shall see that the degree distribution falls asymptotically as k~ 3 . In the limit N — > oo 
the distribution reads |55J 

where 0(x) is the step function: Q(x) = 1 for x > 0, 0(x) = for x < 0. By construction, nodes of degree 
smaller than m are absent. The degree distribution (|2.64|) is in accordance with distributions observed 
for some real networks like the citation network. The exponent 7 = 3 cannot be tuned in this version of 
the model. As we shall see below, a slight modification of the attachment rule will do the job. The next 
important property of the BA network is that the diameter grows as ~ InN, so it is a small-world. The 
clustering coefficient is rather small. For m = 1, C = because the graph is essentially a tree. For m > 1 
many triangles appeal, but their number is small in comparison to the number of connected triples in 
the limit N — > oo. There are obvious correlations on BA networks between the age and the degree of 
nodes: the older node is, the higher degree it has. This is an effect of a pure growth in absence of any 
rewiring of links. In fact, this age-degree correlation is not observed in the WWW, for which the model 
was originally designed, because there are new web pages having sometimes more links than older ones. 

As mentioned, many refinements have been introduced to the BA model to account for some of the 
experimentally observed facts. In particular, one can make the power-law exponent tunable by a simple 
modification of the attachment rule as proposed by Dorogovtsev, Mendes and Samukhin [55]. Here we 
shall refer to this model as to the DMS mo or as to the BA model with initial attractiveness. The 
algorithm is similar to that for the BA model. The only difference is that now a new node chooses the 



This is sometimes called St. Matthew's effect: "For unto every one that hath shall be given, and he shall have abundance: 
but from him that hath not shall be taken away even that which he hath." (Matthew XXV:29, KJV). 
8 E.g. for m = 2 for each new node one new triangle is also introduced. 

9 There are also other models proposed by those authors called DMS models in the literature. 
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older one to which it creates a link, with probability A k , called attachment kernel, proportional to its 
degree plus some constant: 

k + an , 

Ak = ^ vi ' 2 - 65 

where k is the degree of the old node and an is called initial attractiveness. The model can be solved in 
the thermodynamic limit [55]. The degree distribution for m — 1 reads 

that is II(fc) ~ k 1 with 7 = 3 + an. The model can reproduce any power-law exponent larger than 2 
(do > — 1), and therefore it can be adjusted to experimentally observed degree distributions for real-world 
networks. One can summarize this part of the discussion by saying that DMS model became very popular 
because of three important properties: i) it yields scale-free networks with tunable exponent 7, ii) the 
networks are small-worlds, iii) the model is easy to handle in the numerical and analytical treatment. 
Actually, DMS networks can be easily generated but not so easy as BA ones. The innocently looking term 
an in the attachment kernel changes the algorithm complexity, because one cannot apply the trick with 
picking up directed links at random instead of nodes. One has to work with nodes and choose them with 
a probability changing after each step, which increases the computational cost. Fortunately, it was shown 
in |56| that the model with m = 1 is equivalent to a model of growing network with re-direction 
(GNR). The GNR network is constructed as follows. Starting from some small initial graph like in the 
BA model, at each time step one chooses a node i with equal probability from the set of existing nodes. 
Then one introduces a new node which is attached with probability 1 — r to i, and with probability r to 
its ancestoi0- With the choice r — l/(ao + 2) the GNR model is equivalent to the DMS tree model with 
initial attractiveness an. 

One can consider even more general attachment kernels than Eq. (|2 .65(1 - For instance, one can assume 
that Ak behaves asymptotically as k a for large k. When a < 1 that is for sub-linear kernels, the degree 
distribution is exponentially suppressed [56j. When a > 1, links tend to condense on one or more nodes, 
depending on the value of a: for instance for a > 2 almost all links condense on a single node. This 
is the "winner takes all" situation. We have mentioned a similar behavior in the previous section while 
discussing pseudographs, but there only a finite fraction of links condensed. The situation presented here 
is more similar to the condensation of balls on inhomogeneous networks which will be discussed in section 
3.2.2. 



2.2.2 Rate equation approach 

In this subsection we shall discuss rate equations and show how to use them to calculate asymptotic 
degree distribution for a growing network. We shall follow the approach developed in [56]. First, for 
simplicity we shall consider BA model with m — 1, that is the ensemble of growing trees with linear 
attachment kernel. The quantity of interest is Nk(N), the number of nodes having degree k when the 
total size of the network is N . Assume that the initial graph consists of two nodes joined by an edge. 
This means that initially we have N k {2) = 28k,i- For m = 1 the growth process does not introduce 
cycles, so the graph remains a tree. The assumption about the initial configuration is not crucial but it 
simplifies calculations. At each time step a new node is attached to an old node with probability equal to 
k/ J2 q qN q , where k is the degree of the old node. Because the sum of all degrees gives 2L, this probability 
is simply The process of growth is random, Nk(N) may change by or 1. We can formally write: 

N k {N+l) = N k {N)+£(k,N), (2.67) 

where £(fc, N) is a random variable which may assume values 0, 1. Having the probability distribution of 
£ we could generate £(fc, N) at any time step N and simulate the process of growth to get N k (N). But 
we are interested not in a particular distribution of degrees for one network, but in "typical" properties 
of all BA graphs. Therefore we should consider the average (Nk(N)) rather than N k (N). The average is 
over an ensemble of all graphs of size N which can be generated by the growth process. One can show 
[55] that this average exists in the limit N — > 00 and that the system self- averages, which means that for 
N — > 00 the averages over the ensemble are equal to the averages over one network picked up from this 
ensemble. Taking the average of both sides of Eq. (|2.67p we get 

(N k (N + 1)) = (N k {N)) + (£(fc, AO) . (2.68) 

10 Because the network is growing (causal), one can always decide which node is older and fix the ancestor-descendant 
hierarchy. 
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The form of the average of the random variable £ can be deduced from the process of growth. Let us 
focus at some node i having degree t As a result of an addition of new node to the network, i can get a 
new link with probability k/2L. Thus the average change of (Nk(N)} will be — (Nk(N)) k/2L because it 
happens only when the new node chooses one of Nk(N) possible nodes with degree k. But (Nk(N)) can 
also increase by (Nk-i(N)) (k — 1)/2L if the new node connects to any node with degree k — 1. The last 
contribution to (£) comes from addition of a new node with degree k — 1 and is equal to 4,i- Thus the 
full equation for the rate of change of (N k (N)) reads: 

(N k (N + 1)> = (N k (N)) + 4,i + 2{ k N ^ 1} (Nk-i(N)) - 2{N k _ (N k (N)) , (2.69) 

where we take advantage of the fact that for trees L = N — 1, The equation is exact for any N, not only 
in the thermodynamic limit, and could be solved for (N k (N)}. Using this equation one can also calculate 
n(fc) = (N k (N)) /N, i.e. the degree distribution averaged over the ensemble of BA tree graphs. As we 
shall see it is not an easy task (see Chapter 3, Sec. 3.1.2). It can be simplified by neglecting finite-size 
corrections in the limit of large networks, in which case the degree distribution can be calculated by 
substituting (N k (N)) ~ NTl(k) and assuming that H(k) tends to a stationary statjjj]. In this case one 
gets: 

n(fc) = 4,i + ^n(fc - 1) - ^n(fe) + o(i/n). (2.70) 

In the thermodynamical limit the term 0(1/ N) can be neglected. Rearranging this equation: 

(k + 2)II(fc) = (k - l)n(fc - 1) + 24,i, (2.71) 

one immediately obtains 

n(l) = 2/3, (2.72) 

n(fc) = Iq^ 11 ^- 1 )' Vfc>1 > ( 2 - 73 ) 

and by iterating Eq. <|2.73|) one eventually arrives at the following degree distribution for k > 1: 

U{k) = k(k + l)(k + 2Y < 2 ' 74 > 

Following [56J, let us apply the same method for a general attachment kernel A k . Now, the probability 
that a new node will be attached to the older one with degree k, is A k /A(N) where A(N) is a normalization 
coefficient: 

A(N) = J2 A kN k (N)- (2.75) 

k 

In the limit N — > oo, all N k ~ N and thus we can assume that A(N) rss Nrj where r\ is some constant 
to be determined later. Proceeding exactly as above for pure BA model, we get the rate equation in the 
form: 

n(jfe) = s k i + ^in(fc - 1) - — n(fc), (2.76) 

T) T) 

which can be solved with respect to n(fc): 

nw = in(i + ^)"'. ("7) 

3=1 J 

The parameter rj can be obtained from the normalization of the degree distribution: J^k n(fc) = 1. If we 
now assume a shifted linear kernel like in the DMS model: A k = ao + k, we find 

n(fc) = 11 a ° + 1 a ° + k = 71 r(o + fc + i)/Q ! 

ao + k ao + 1 + r\ a + k + rj a + k T(a + k + rj + l)/(ao + r))\ ' 

and j) = 2 + a as follows from A(t) = J2k(^ + a a)Nk(N) = 2L + Na a . Inserting this into Eq. (|2.78j) we 
end up with Eq. (|2.66p . It reduces to the BA degree distribution 1|2.74[1 for ao = 0, that is for the purely 
linear attachment kernel. In figure 12.81 we show plots of II(fc) calculated analytically using Eq. H2.78J1 
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Figure 2.8: Left: Degree distribution for pure BA model with m = l,no = 2 for networks of different 
sizes. Solid lines from left to right: N = 1000,2000,4000, averaged over 10 6 generated networks. Circles: 
n(/c) for a single network with N = 10 7 . As N grows, plots approach theoretical distribution H(k) ~ k~ 3 . 
One also sees that averaging over the ensemble is (up to finite-size effects) equivalent to averaging over 
one large network (self-averaging). Right: plots of H(k) for DMS models with N = 10 6 and various a , 
compared to the theoretical distributions (|2.66p . The plots correspond to ao = 2.1 (the smallest slope), 
ao = 3 and ao = 4 (the largest slope). All results are averaged over 100 networks generated in the GNR 
model, equivalent to the DMS model. 



and measured in numerical simulations of networks generated by the GNR version of growing network 
algorithm for various ao. 

The same method allows one to determine II(fc) for sub- and super-linear kernels, cited in the previous 
section, or to calculate degree distribution for non-tree growing networks [SSI [57]. It can also be used to 
find degree-degree correlations [56] by writing a rate equation for Nk, q , the number of nodes with degree 
k attached to ancestor nodes of degree q. The exact result for BA is fairly complicated, but in the limit 
k, q — > oo with y = q/k kept fixed, it simplifies to 

This function has a maximum ati/~ 0.372 which means that the ancestor node's degree is approximately 
37% of its descendant. The correlation function ek q defined in Sec. 1.3 and calculated from the formula: 

e(k,q) = Nk ' 9+ L Ng '\ (2.80) 

does not factorize: e(k, q) ^ e T (k,q) which means that the network is correlated. A similar behavior 
is observed for shifted attachment kernels. The assortativity coefficient A defined in Eq. 1|1.6|1 can be 
calculated for pure BA model. From Eq. (41) in [SB], and Eq. (|2.80p we obtain 

„ , N 2(5/c 2 -3fc-2) 

< k ^= whww^j - (2 - 81) 

Using Eq. (|1.6p after some tedious but straightforward calculations we find: 



, 33 - 241n4 

42 _ 47r2 « -0-1075, (2.82) 

which stands in a very good agreement with numerical simulations. This indicates that the BA growing 
tree network is disassortative. 

Many improvements of BA growing network models have been proposed (for a review, see e.g. [4j) . The 
growing BA network can be used as an initial configuration for the algorithms, like those described before, 
to generate scale-free networks with some features enhanced [6]. In this way one can also extrapolate 
between causal and equilibrated networks. 



11 One can show that N^'s from Eq. I|2.69p grow as ~ N |56| for large N and therefore II(fe) has a stationary state in the 
thermodynamical limit. 
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Figure 2.9: Planted rooted trees of size N = 1,2,3. All causal labellings are surrounded by dashed 
rectangles. 



2.2.3 Statistical ensemble formulation of growing networks 

Although many properties of growing networks can be understood using rate equations, sometimes it is 
convenient to introduce the ensemble of causal graphs and calculate desired quantities from the partition 
function. As mentioned, such an ensemble cannot be thought as an ensemble with the Gibbs measure, 
in the usual statistical sense, but merely as an ensemble of networks which can be obtained in a growth 
process, if this process is terminated at some moment of time. In this subsection we shall define such an 
ensemble for trees with the product weight <|2.24[) . As we shall see, the model is on the one hand solvable 
and on the other hand it exhibits non-trivial behavior. In particular, we shall be able to quantify the 
effects of causality. 

Because a tree with N nodes has L = N — 1 links, we shall define the canonical partition function 
which depends only on N: 

Z ^ = jn. ^ (2-83) 

' aelct(N) 

where the sum runs over all labeled causal tree graphs. The causal ordering of nodes' labels selects a 
relatively small fraction of all possible labeled trees. The calculation of Eq. (|2.83|) is much simpler for 
planted rooted trees, i.e. trees with an additional link (a stem) attached to one of its nodes. The stem 
acts as an additional link which marks one node of the tree and increases its degree by one. Because only 
one node is marked, in the thermodynamic limit ensembles of trees and planted rooted trees have roughly 
the same properties. In figure [2U1 graphs with N — 1,2,3 are sketched. Following [33 we shall derive 
a recursion relation for Z{N). First, we observe that every tree of size N + 1 can be constructed from 
trees of sizes N%, . . . , N q where J2i=i N% — N, by attaching their stems to a common node (see Fig. l2.10j ). 
This new node is attached to a new common stem. Denoting by n(N) the number of different labellings 
for the set of trees of size N we have 

N\ 1 

" (JV + 1) = JV 1 !-.JV ? ! / W,,, " W ' (2 - 84) 

The origin of factorials is the following. The whole tree has N + 1 labels, but the smallest label must 
be attached to the root because of the causality. The remaining N labels can be distributed arbitrarily. 
All N^. permutations of Ni labels of a subtree are undistinguishable and thus they give the same graph. 
To avoid overcounting one divides by A/,!. This leads to the multinomial factor. In addition, q subtrees 
can be permuted in q\ possible ways giving the same labeled graph, thus we have to divide by ql. The 
functional weight W{N + 1) for the set of compound trees also factorizes: 

W{N + 1) = p(q + l)W(JVi) • • • W(N q ). (2.85) 

Notice that the new node has degree q + 1 because the root is counted as a link. The partition function 
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Figure 2.10: Construction of a new planted rooted tree from q — 3 trees of size N% , N 2 , N 3 . The large 
circle denotes the set of all trees of given size. One node (small circle) is distinguished by the stem joined 
to it. The new set of trees of size Ni + N 2 + N 3 + 1 is obtained by joining the trees from the left-hand 
side to a new common rooted node (right-hand side). The new node has degree four since the root is 
counted as a link. 



(|2.83p can be expressed as a sort of self-consistency equation for Z(N): 

1 °° N\ 1 

Z{N + 1) = TNTTV^ E ^,N 1+ -.. + N g , ; N ^ n(N l )...n(N q ) 

^ q=lN u ...,N q L ' q ' H ' 

X p{q + l)W{N 1 )---W{N q ) 

= ]vTlE^r^ E W,n« (2-86) 

q=l q ' N 1 ,...,N q i=l 

where Z(N) appears on both sides. The sum goes over all subtrees 1,2, ... ,q of sizes Ni,...,N q with 
the only constraint given by the delta function. The constraint can be decoupled by introducing a 
grand-canonical partition function: 

oo 

Z( M ) = Z(N)e- N ^ (2.87) 

N=l 

which is just a generating function for the canonical partition functions Z(N). Here fi is the chemical 
potential but contrary to the previous definition (|2 . 16|1 . it controls the average number of nodes, not 
linkfl- Multiplying both sides of Eq. (|2.86p by (N + l)e~( N+1 ' ltl and summing over N — l,...,oo we 
get: 

oo oo , 1 s / oo \ 1 

J2 NZ(N)e- N » = e~" £ m+R £ e -^Z(JV,) . (2.88) 

N=2 g=l ^' \Ni = l J 

If we add the term Z{l)e~^ to both sides of this equation, the left-hand side becomes just a derivative of 
—Z(fi) with respect to fi, while the right-hand side is a sum over q extended to the range q = 0, . . . , oo, 
which additionally includes the term for q — 0. Thus we get 

Z'fa) = -e-"F(Z( M )), (2.89) 

where F(x) is the generating function for the distribution p(k) like in Eq. (|2 .50(1 : 

°° x q 

F(x) =5>(g+l) — . (2.90) 

This series may have a finite or infinite radius of convergence, xq. The equation (|2,89p can be integrated 
over d/z. This yields 

and becaus^l F(x) > for x > and F(x) — > oo for x > xq, the integral is bounded from above. Hence 
the chemical potential ii(Z) is bounded from below: /i — > fiQ as Z — > oo. This means that Z as a function 

12 Because L = N — 1, the difference is in fact meaningless. 
13 We can exclude the trivial case when all p(q)'s are zero. 
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Figure 2.11: Typical behavior of functions discussed in section 2.2.3. 
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of ji has a singularity at Mo (see Fig. I2.1lj ): 



Mo 



In 



dx 



(2.92) 



From definition of the partition function (|2.87j) we have that Z(N) shall grow as ~ e^° N or faster. 
We assume now that the ensemble of trees is normal in the statistical-thermodynamical sense, that is 
Z(N) oc e^° N . As we have seen, this is not true for simple graphs, but the number of causal trees grows 
only as (N — 1)! [35] and not as ~ (iV 2 )! for graphs. Therefore, many quantities as for instance degree 
distribution can be obtained from the critical value Mo- For example, according to Eq. (|2.35|) the degree 
distribution reads 

,, s f x o dxx k 

p(k) Jo 



U(k) = p(k) 



dMo 
dp(k) 



(2.93) 



Thus, similarly as for simple graphs, by tuning p(k) one can obtain any desired degree distribution. It is, 
however, not as trivial as in case of Eq. (|2.36j) because the dependence on F(x) is now more complicated. 
Some interesting distributions were investigated in |37J. For instance, with the choice p(k) = (k — 1)!, the 
generating functions reads F(x) = (1 — a;) -1 and has the radius of convergence xq = 1. The integrals in 
(|2.93p can be done analytically. The result is 



n(fc) = 



fc(ft + l)(jfe + 2)' 



(2.94) 



so one recovers the BA degree distribution. We can show that, indeed, causal trees with the product 
weight 

W(a) = (fci-l)!---(fcAT-l)! (2.95) 

form the same ensemble as BA growing trees. To this end, let us consider a set of all causal trees a which 
for a given degree sequence fei, . . . ,kjf have the statistical weight W(a) given by Eq. (|2.95j) . Imagine 
also that we have a Markov process which generates such trees. First, we see that the number of possible 
causally labeled trees in this set is obviously the same as in the BA model. We have to check whether 
also the statistical weights are the same in both cases. Imagine that we take a graph a with N nodes 
with degrees fei, . . . , kjy and attach a new node by linking it to a node n. We obtain a new configuration 
(3, which has now N + 1 nodes with degrees hi, . . . , k n + 1, . . . , fcjv, 1. The transition probability a — > [3 
for a process which has a stationary distribution (|2.95p is 



W{(i) = (k 1 -l)\---k n l---(k N 
W{a) 



l)\ -1 



(fci -!)!.••(** -1)! 



(2.96) 



and we see that it is identical to that for linear attachment kernel in the BA growth process. In conclusion, 
this shows the equivalence of the two approaches. 

The formulation of the B A model of growing networks via statistical ensemble can be used to calculate 
degree-degree correlations or the average distance (r) between any two nodes [37]. For instance, it can 
be found that (r) = (1/2) In N, that is the BA network really displays the small- world phenomenon. 



2.3 Causal versus equilibrated networks 

So far we have discussed equilibrated and causal networks separately. We have shown that for both 
classes some properties of networks can be essentially the same, as for instance the power-law degree 
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Figure 2.12: The distance distribution G(r) for unweighted equilibrated trees (circles), scale-free equili- 
brated trees (diamonds) and scale-free causal trees (squares) of size N = 1000. S-F causal trees are the 
shortest. 



distribution. Now we shall compare these two ensembles and see that although H(k) can be identical in 
the thermodynamic limit, some other characteristics of the network topology are different for causal and 
equilibrated graphs. As shown in [35], causal trees form only a small subset of all trees. The same is true 
for simple graphs. The fraction of causally labeled trees among all labeled trees is only ~ N 3 ' 2 e~ N , So 
the chance of picking up at random a causal tree from the set of all trees vanishes when N grows. We 
shall show that geometrical properties of typical trees in this subset are quite different from those in the 
whole set. 

Let us first consider the ensemble of unweighted equilibrated trees and the corresponding ensemble for 
causal trees. Here "unweighted" means that all trees have the same functional weight equal to one. We can 
now calculate some geometrical quantities for trees in the first and in the second ensemble. An example 
of such a quantity is the average distance (r). In fact, one can calculate it analytically [37 ] [56 l l58l 1591160] . 

For equilibrated trees it is 

(r) ~ y/N, (2.97) 

which means that the fractal dimension of typical equilibrated trees is equal to 2. These trees are therefore 
rather elongated and certainly are not small-worlds so abundantly observed in nature. On the other hand, 
for causal trees, 

(r)-lniV, (2.98) 

hence the fractal dimension is infinite. This is because most of nodes concentrate around the oldest node. 
A similar observation was made for weighted trees with BA degree distribution [35]. An even better 
insight into geometrical properties of trees (or graphs) is provided by the distribution G(r) of distances 
r between all pairs of nodes: 



G(r) 



■5>, 



r(ij) 



(2.99) 



Here r(ij) is the length of the shortest path between two nodes The average distance is the mean of 
this distribution: (r) = ^2 r rG(r). In figure \2A2\ we present a comparison of G(r) for equilibrated and 
causal trees of the same size. Causal trees were generated using the BA model while for equilibrated 
trees we used the Monte Carlo algorithm described in section 2.1.5. The weights p(k) = 4(fc — l)!/(fc(fc + 
l)(fc + 2)) were chosen according to Eq. (|2.49[) to get the same degree distribution as in the BA model. In 
Fig. 12.131 we see that indeed both types of trees have the same n(fc), so one cannot easily distinguish to 
which ensemble the given tree belongs, by only measurin £3n(fc). But one easily sees in Fig. 12.121 that 
the causal trees are much shorter than the equilibrated ones. If we assume that the average distance 
scales for equilibrated trees like 

(r) h ~ y/N/hiN, 



and for causal trees: 



(r) 



In N, 



(2.100) 



(2.101) 



14 We shall see in the next chapter that this statement is true only in the thermodynamic limit. For any finite N there 
are finite-size corrections, which are different for both ensembles. To see a difference coming from the finite-size correction 
one has to have much better statistics than in figure I2T3I 
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Figure 2.14: Left: G(r) for equilibrated trees plotted in the rescaled variable: x = r/^/N/ In TV for 
different sizes N = 500,1000,2000,4000. Right: the same quantity for causal trees, for x = r/\nN. 
Continuous lines are given by Eqs. (|2,102|) and l|2.103p . 



we can plot curves G(r) for different N in the rescaled variable x = rj (r) and observe that they collapse 
to some characteristic curves but different for each of the two ensembles (see Fig. I2.14p . The function 
Gh(x) for equilibrated trees is well approximated by 

G h (x) = Axexp(~Bx 2 /2), (2.102) 

while for causal trees by 

G c {x) = A'exp(-(x-x) 2 /B'), (2.103) 

with some parameters A, A', B, B', x fitted to data. So again, the average node-node distance is smaller 
for the causal trees than for the equilibrated ones with the same degree distribution. The effect is 
qualitatively the same when one considers simple graphs instead of trees. Thus the causality enhances 
the small- world effect by increasing the relative weight of graphs with clusters of nodes around the oldest 
vertices. 

There are many other differences between the causal and equilibrated networks. We shall give one 
more example showing the difference in degree-degree correlations in both types of trees. A quantity 
which is commonly used to study these correlations is the average degree k nn (k) of the nearest neighbors 
of a node with degree k, defined in Sec. 1.3 and expressed through Eq. f| 1 . 5[) . For uncorrelated graphs it 
is e x {k,q) — kqH(k)IL(q) / (k) 2 and thus 




(2.104) 



which gives a constant value fc nn = 1 + fc for maximally random graphs in the ER model. In general case 
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Figure 2.15: Plots of k nn (k) for ER random graphs (diamonds), random trees with BA degree distribution 
(squares) and causal BA trees (circles). Both causal and random trees are disassortative, but they differ 
in approaching k — > oo. The descent of experimental curves for large k is caused by finite-size effects. 
Theoretical curves (solid lines) are calculated from Eqs. (|2. 105|) and <|2. 106(1 . 



for equilibrated trees with an arbitrary degree distribution it can be shown |38j that 

fcnnW = 2 + 1 ((fc 2 ) - 4) . (2.105) 

This result differs from the corresponding one for causal trees. For instance, for causal trees with BA 
degree distribution |61j, 

fcnn(fc) = i-Z - + - . (2.106) 



2 V 2 k , 

The second moment (A; 2 ) depends on the size N and can be calculated for growing BA trees |62j: 

(k 2 ) = (2-2/N)H(N -1). (2.107) 

Here H(n) = J2'i=i 1/* is the harmonic number. The same formula performs well for random BA trees. 
In figure [2~T5l we plot k nn (k) for BA causal and equilibrated graphs, and also for ER graphs. From 
Eq. H2.105J1 we get k nn (k) — > 2 when k approaches infinity, while for causal trees the limiting value is 
proportional to the second moment of the degree distribution, and thus diverges for N — ► oo. This means 
that the affinity of nodes with higher degrees and their tendency to cluster together grow with the size 
of tree. This is another argument supporting the conjecture that causal trees are more compact than the 
corresponding equilibrated ones. 
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Chapter 3 

Applications to modeling complex 
networks 

In this chapter we present some further applications of the mathematical methods developed so far. 
First we shall quantify finite-size effects in networks. Usually, while discussing networks one calculates 
quantities of interest in the thermodynamic limit N — > oo. As we will show such a procedure may 
lead to neglecting some important effects, which are seen for finite networks. In particular, the node- 
degree distribution exhibits for finite N apparent deviations from the limiting distribution. We will find 
an explicit form of finite-size corrections to the scale-free behavior for growing networks and talk over 
corrections for homogeneous graphs. A second problem which shall be discussed in this chapter concerns 
a very important class of phenomena which describe the dynamics of statistical processes on networks. 
On the example of a zero-range process we will show the usage of techniques developed in the previous 
chapter. 

3.1 Finite-size effects in networks 

In the preceding chapter we discussed some popular models of networks, for which we determined degree 
distribution, clustering coefficient, diameter etc., in the limit of infinite networks. The derivation of exact 
analytical result was possible because in this limit structural constraints, like for example that on the 
sum of degrees, become less important and some of them loose their virtue at all. For instance, we 
mentioned that in the thermodynamic limit the canonical ensemble for homogeneous graphs with fixed L 
is equivalent to the grand-canonical one where L can in principle fluctuat^]. However, it is not the case 
for finite N and one has to incorporate the effect of finite-size constraints into calculations. 

One must be very careful while comparing models solved in the thermodynamic limit to real-world 
networks. For finite TV, some local quantities like node degrees are bounded from above. There are also 
some effects resulting from network's features which are rare but can significantly change the picture for 
small graphs. For example, it is known that in many models, as for instance in the ER model, large 
graphs are essentially trees, because the average number of cycles of finite length is constant and does 
not depend on the network size N. On the other hand, for smaller graphs, short loops play an important 
role. Their presence shapes the network and strongly affects its global properties. 

In next sections we shall discuss one type of finite-size effects, namely that which is related to the 
appearance of a cutoff in the degree distribution of finite networks. We will present our recent findings 
for various graphs and compare them to those from the literature. 

3.1.1 Cutoff in the degree distribution 

As we have pointed out in Sec. 1.3, for any finite network the power-law behavior of the degree distribution 
n(fc) can hold only for values of k significantly smaller than N. Both experimental data and theoretical 
models of scale-free networks indicate that the behavior of n(fc) for k ^> 1 for a finite network exhibits 
two regimes: below some fc ma x it follows the power-law behavior as in an infinite network while above 
fcmax it displays a much faster decay. The characteristic degree fc max which separates these two regimes 
is called a cutoff. Intuitively, the cutoff comes about due to the fact that the overall number of links 
present in a finite, non-degenerated graph is restricted and so is also the degree of each node. Thus 

1 We have shown this explicitly for equally weighted random graphs. A more general situation is considered in |18| . 
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for any finite network the power-law behavior of the degree distribution is truncated. In consequence, 
many quantities calculated on finite networks significantly differ from their counterparts derived in the 
thermodynamic limit. One can see this effect for example when one calculates percolation thresholds for 
statistical systems on networks, like for instance those describing infection spreading for real diseases or 
computer viruses. 

Many attempts were undertaken to estimate the position of the cutoff for different scale-free networks, 
most of them concentrated on sparse networks where the average degree k is fixed. This restricts the 
class of distributions H(k) to those which have a finite mean value, and the power-law tail exponent to 
the range 7 > 2, which is indeed observed for real networks. 



1000 



Article 


a for 2 < 7 < 3 


a for 7 > 3 


(a) [63], homog. simple graphs 


1/(5-7) * 


1/2 * 


(b) [64J, homog. simple graphs 


1/2 * 


1/(7-1) * 


(c) [65], uncorrelated networks 


1/2 


1/(7-1) 


(d) Pseudographs 


V(7-l) 


1/2 


(e) |66J , growing trees 


1/(7-1) 


(f) [34], homog. trees 


1/(7-1) 




Table 3.1: Some results for the exponent a in the cutoff k max ~ N a . A star (*) denotes two inconsistent 
predictions. By pseudographs we understand graphs with self- and multiple-connections with the partition 
function (|2.31fl . for which corrections to the degree distribution (|2-36|) can be found by observation that 
it is equivalent to so called balls- in-boxes model [36], and calculated in the limit of large but finite N 
|67j . In figure on the right-hand side we compare different exponents graphically. Left axis: A; max for the 
network of size N — 1000, right axis: the value of a. 



In general, the position of the cutoff k mayc scales as N a for large N with an exponent a < 1. The 
value of the exponent depends on the type of network. In table 13.11 we collect some values a calculated 
for S-F graphs of various type, together with references to the original papers where the quoted values 
were derived. The exponent a is calculated as a function of the exponent 7 in the power-law tail of 
the underlying degree distribution. In the discussion of S-F networks one should differentiate between 
equilibrated and growing networks, and simple graphs, pseudographs and trees. Note, however, that for 
7 = 3, a = 1/2 in all cases. The value 7 = 3 is in a sense marginal because it separates the regime of 
anomalous fluctuations for 7 < 3 and of normal fluctuations for 7 > 3. In the former case (k 2 ) is infinite 
in the limit N — > 00 while in the latter one it is finite. 

Let us make some remarks on the results in the table. First, the result (a) has been found recently 
in [63] in the statistical ensemble approach. It is in disagreement with the results (b). The authors [63] 
claim that (b) gives only an upper bound on the value of the exponent a for 2 < 7 < 3. Second, the 
scaling for trees seems to be the same for growing (e.g. the BA model) and equilibrated ones. Third, 
equilibrated pseudographs have different cutoffs than simple graphs or trees. Fourth, the result (c) applies 
only to hypothetic uncorrelated graphs with no correlations between degrees of nearest neighbors. Any 
finite network has certain correlations of such type, simply because of global constraints like that on the 
sum of degrees coming from the fixed number of links L. Therefore, as we have mentioned in Chapter 
2, the two-point correlation function e(k,q) ^ e x (k,q) even if we do not introduce correlations explicite. 
For instance, I1(N — 1) can be non-zero for simple graphs but then it is impossible to pick a link joining 
nodes both of degree k — N — 1 as it would stem from Eq. 1|1.4[1 . The authors [65J are aware of this effect 
and conclude that for assortative networks the cutoff should be smaller than the one predicted in table 
13.11 while in case of disassortativity it should be larger. 

The results in the table were obtained with the help of different methods. For homogeneous networks 
many of them were based either on some simple probabilistic arguments or extreme values statistics. 
Those methods allow one to determine the cutoff but not the shape of the function giving the finite-size 
correction to the degree distribution. For growing networks, however, the shape of the corrections can 
be found. In [66] the BA model of growing tree network has been solved for finite TV. The authors have 
calculated the mean number of nodes of a given degree for the network of size N and deduced the form 
of the correction to the degree distribution for the pure BA model with 7 = 3. In the next section we 
shall present a more general method which also applies to other growing networks and we shall use it to 
determine the form of the cutoff function. In the last section we shall present Monte Carlo simulations of 
networks which allow for the estimation of the cutoff function and the exponent a, and we shall compare 
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the results to those in table 13. ll 



3.1.2 Growing networks 

Here we would like to present the method of determining of the cutoff functions for growing networks. 
We shall explain it for the BA growing networks with initial attractiveness |63j described in Sec. 2.2.1. 
Some of results presented here were obtained in Ref. [66]. However, our approach is different and allows 
for solving more sophisticated variants of the model. Before we start, let us give some key points of the 
method here. We begin with the rate equation for the average number of nodes Nk{N) of a given degree 
k. The average is taken over the canonical ensemble of growing trees as in Sec. 2.2.2. The solution of 
the rate equation in the limit of large N gives, up to a normalization constant, the degree distribution 
Tloo(k) for the infinite network. 

For a finite network, we define H(k) as a product of the limiting degree distribution n oo (fc) and a 
cutoff function v{k, N) giving finite-size corrections. The recursion equation for this function can be 
obtained from that for Nk(N), The next ingredient of the method is to consider moments of the cutoff 
function. One can derive recursive equations for the moments from the recursion relations for v(k, N). The 
equations can be solved recursively and one can derive explicit asymptotic expressions for the moments 
for sufficiently large but finite N. The knowledge of all moments makes it possible to reconstruct the 
leading behavior of the cutoff function v(k,N). This is the sketch of the method which we shall explain 
below in details. Although the idea is very simple, its implementation leads to quite complicated and 
lengthy calculations which we omit here, referring the reader to the original paper [62] . 

We start from the BA tree model with m = 1, do = and thus 7 = 3. Like in Sec. 2.2.2, as an initial 
configuration we take the graph with no = 2 nodes joined by a link (a dimer configuration), therefore 
iVfe(2) = 26k,%. At each time step a new node is added and connected to one of N existing nodes in the 
system, with the probability proportional to the number of the preexisting links of the corresponding 
node, leading to a new network with N + 1 nodes. According to Eq. (|2.69[) . the rate equation for the 
average number Nk(N) has the form: 

N k (N + 1) = N k (N) + 4,i + ^"^ ATfe-iW - 2{N k _ 1) W), (3.1) 

where, for brevity, we have omitted the angle brackets denoting the average. The origin of all terms has 
been already explained in Sec. 2.2.2. This equation is exact for any N. In the limit of N — ► oo it has a 
solution given by Nk(N) » NHBA(k), where 

H - (fc) = k(k + l)(k + 2) ^ 

is the degree distribution in the BA model. Here we are, however, interested in the general solution for 
Nk(N) = NU(k,N), with H(k,N) being the degree distribution for a finite network. It is convenient 
to split n(fc, N) into the product of the known function YlsA^k) and an unknown function v(k,N) 
giving finite-size corrections. With the substitution Nk(N) = TlBA(k)v(k, N), the equation l|3.ip can be 
rewritten in terms of v(k, N): 

3, 2 + k ^ A~2N + k , s , s 

v(k, N) = -4,1 + 2(jv _ 2) «(fc — 1,N — 1) — v(k, N - 1). (3.3) 

Multiplying now both sides of Eq. (|3.3j) by k q and summing over k = 1, . . . , oo we get 

m q (N + 1) = ^ ^3 + c gi mi(N) + (2N + q + l)m q (N^J , (3.4) 
where define moments m q (N) for the distribution v(k,N) as follows: 

m q (N) = w — [ J2 kqv ( k > N ^ ( 3 - 5 ) 
fe=i 

The normalization constant 1/{N — 1) has been chosen for the later convenience. The initial condition 
reads 

m q {2) = 3, (3.6) 



35 



for all q as can be found for the initial configuration. The coefficients c q i are given by: 

c g0 = 3, and = 2,(^J + ( g _^\ for i > 0. (3.7) 

The equation (|3.4p can be solved recursively starting from the lowest moments m , mi , m 2 , ■ ■ ■ . Prom 
expressions for the first moments we can infer that the general solution has the form: 

^W = f^gfT2Tk r ^ + i/2) ' (3 - 8) 

with some coefficients B q i, yet unknown. The equation for coefficients B q i can be found by inserting 
(|3.8p into Eq. (|3.4p . For large N, the leading behavior of m q (N) is controlled by the term proportional 
to B q:q+ i: 

with A q = i? 9j9+ i/r((5 + g)/2). Each two consecutive moments m q+ i and m q differ by a prefactor TV 1 / 2 , 
so clearly the cutoff function must have the form: 



v(k, N) ~ Nw{k/VN), (3.10) 
where w(a;) is a universal (independent of N) cutoff function having moments equal to A q : 

/>oo 

A q = dxw(x)x q . (3.11) 

JO 

Therefore, the leading correction to the degree distribution for a large but finite BA tree network is 

U(k, N) = U BA (k)w (-^=) . (3.12) 



N 

The exponent a = 1/2 stemming from this equation agrees with the result for trees presented in Table 
13.11 The function w(x) can be found in two ways. First, we can evaluate numerically Eq. p,3j) for some 
large N and then rescale variables according to Eq. 1|3.10|) . Second, it can be obtained analytically by 
reconstructing it from the moments A q , which express through the coefficients B q ^ q+ \. Without going 
into the details we quote the result for the moments A q [62] : 

_ (2 + g)V 

^-r((3 + ,)/2)' (3 ' 13) 

Using the asymptotic behavior of Eq. (|3.13p we can infer the form of the cutoff function w(x) for large 
values of the argument: 

\nA q w iqlng. (3.14) 
Let us now compare Eq. (|3.14p with the behavior of moments I q of the function exp [— (x/a) p }: 

r 9+l 



r°° a q+1 /<7+l\ 

I q =j x q e^[-{x/a) p ]dx= — T i ■ ( 3 - 15 ) 



For large q the leading term of ln/ 9 w {qhxq)/p with p = 2 is the same as in Eq. (|3.14p . i. e. the tail 
of w(x) defined by its higher moments falls like a Gaussian. The parameter a is found by comparing 
sub- leading terms in I q and A q . The value a = 2 obtained in this way will be confirmed below by a direct 
calculation of w(x). To this end we define a generating function: 



M(z) = J2 Aq ^- ( 3 - 16 ) 
Comparing this definition with Eq. (|3.1ip we see that M(z) = exp(zx)w(x)dx so that 

/•oo 

M(-z)= / exp(-zx)io(x) dx (3.17) 
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is the Laplace transform of w(x). Therefore w(x) is given by the inverse Laplace transform of M(z) or, 
equivalently, by the Fourier transform of M{— iz): 



w{x) 



1 

2ni 



dze zx M(-z) 



1 

2^ 



dz e lzx M(-iz). 



Using the explicit form of coefficients A q we get 



^r( g + 2)r((3 + 9 )/2) 2 



(3.18) 



(3.19) 



This series has an infinite radius of convergence. The function M(z) given by Eq. (|3- 19|) is a special case 
of a more general power series: 



M(z) = AAV ( a g + b ) r (g + z , 



(3.20) 



belonging to the class of so called Fox- Wright \& functions [681 Ifi9]. In [62] it has been shown that its 
inverse Fourier transform, that is w(x), can be expressed through a combination of auxiliary functions 
fx,4>,£,(,( x )- I n general, they are defined as follows: 



r(g- 8 )r(i- a ) 
r(c-s)r(v>- X s)' 



(3.21) 



where the sum runs over all points s, at which either T(l — s) or T(£ — s) has a pole. The above formula 
simplifies for £, £ being positive integers m, n: 



f M _ y r x q (m-2-g)(m-3-g)---(Tt-l-g) 

^ r(V> - x - xw 



The final formula for w(x) for arbitrary x-> tyi £j C reads 



(3.22) 



(3.23) 



where / is given either by Eq. Q3,22p or by more general Eq. (|3.2 1[1 . In our case, which corresponds to 
Af = 1, a = 1, b = 2, x = 1/2, ^ = 3/2, £ = 3, C = 2, the function u;(x) is given by 



= Xf 1/2 ,l,2,l(x) + /l/2,3/2,3,2(a0 = 



9=0 



r(l/2-g/2) r(l-g/2) 



(3.24) 



After some algebraic manipulations we get the function w(x) expressed as an infinite series: 



w(x) = 1 — 



9=1 



.2g+ 



i (-i)V 

g!2 2 <?(2g + 1) ' 



One can check that it corresponds to a Taylor expansion of the result given in [66] : 

w(x) = erfc(x/2) + 2x +f e - x '' 4 , 

V47T 



(3.25) 



(3.26) 



where erfc(z) is the complementary error function. The approximate result of Ref. [70] is close to this 
exact formula. The series in Eq. I|3.25p is rapidly convergent, and if truncated at some g max , can be used 
in numerical calculations. One sees that the cutoff is indeed of a Gaussian type, with the variance a 2 = 4 
in agreement with the asymptotic behavior of A q discussed above. 

These calculations can be easily extended to the case of an arbitrary initial graph. For example, if we 
assume that at the beginning we have a complete graph with no nodes, after repeating all the steps of 
calculations, we will obtain the following formula for the moments: 



r(l + n — uj) 
T(n + 3/2 - lu + q/2) 



r(4 + g) m (rao)r(2 + no + g) 
2(g+l) T(n + 2) 



exp ( -glng + 0(g) ) , (3.27) 
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Figure 3.1: Plots of the cutoff function w(x) calculated from Eq. (|3 . 28[) for the BA model without initial 
attractiveness, for the initial graph with n = 3 (black lower line) and 5 nodes (black upper line) agree 
very well with w(x) obtained from averaged degree distribution for 2 x 10 4 generated networks of size 
N = 10 4 (gray lines). 



with to = no(3 — tlq)/2 and the zero-th moment mo(no) = (no + l)n.o/2 being just the number of links 
in the initial graph. This allows us to infer the asymptotic behavior of w(x) which is the same as for 
no = 2. Therefore, the degree distribution for the BA tree model without initial attractiveness has always 
a Gaussian cutoff whose position scales as ~ N 1 / 2 . The full expression for w(x) takes the form: 



w(x) = r(l + no — to) 



,4,2 



Or) 



m (n ) 
T(n + 2) 



fi 



,710+1—^,2+710,1 



(,•) 



(3.28) 



In figure 13.11 we have plotted w(x) calculated from Eq. (|3.28|l together with the results of Monte Carlo 
simulations for finite-size networks. One readily sees that w(x) strongly depends on the size of the seed 
graph no. This sensitivity to the initial conditions has just been reported in [66] as well as in Ref. [71] 
where another quantity has been measured. 

Let us go now to the case of preferential attachment kernel k + ao , that is to the model with initial 
attractiveness ao > — 1. From Eq. (|2.66|) we know that H oc (k) ~ fc~ 7 with the exponent 7 = 3 + ao. 
Like we said, the model is equivalent to the growing network with re-direction (GNR model), described 
in previous chapter, with the choice of the parameter r = l/(ao + 2). In all numerical simulations showed 
in this section the GNR model is used. On the other hand, in analytical calculations we shall follow 
the procedure, which we describe above for the pure BA model. Assuming that we start from the dimer 
configuration, using the recursion formula for Nk(N) we get an equation for the moments m q (N) which 
can be solved in the form which involves some (yet) unknown coefficients B qq+ i: 



m q {N) 



1 



r in- 1 



2+a 



(N-l) l=0 



9+i t(n+^) 



r 2 



I o i ap+i 1 
\ Z 2+a J 



(3.29) 



For large but finite N it simplifies to 



m q (N) ~ N*+°oA q , 



(3.30) 



with A q = B q , q+ i/r 



9 I ap+q+1 
L + 2+a 



. Therefore, the function v(k, N) obeys the following scaling rule: 



v(k,N) -> Nw (k/N 2 +"o 



(3.31) 



where the function w(x) has now moments A q depending on ao. Equation 1|3.31|1 indicates that the cutoff 
scales as TV 1 ^ 7 " 1 ) where 7 = 3 + ao is the exponent in the power law for ILx^fc). This is in agreement 
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with the result presented in table I3.ll For given N, the cutoff decreases when the exponent 7 increases. 
Practically, this implies that the power law in the degree distribution can hardly be seen for 7 > 4 because 
even for large networks with N — 10 6 nodes the cutoff corresponds to the value of k max ~ 100 and the 
power-law extends maximally over 1 — 2 decades in k. This partially explains the fact why S-F networks 
with 7 above 4 are practically never encountered [2]. 
The moments A q can be found: 



r y 1 + 2TT ) [6(g + 2) + 00(13 + 4a + 3g)] T (4 + 2a + q) 

Aq ~ 2 ^ < ^ : - ^ +3ao +q \ ' (3 - 32) 

2+qo J 



T(5 + 2ao) (1 I q)T ^ 5 + 3a o+g ^ 



The equation (|3.32|) is much more complicated than Eq. f|3.13fl but it reduces to it for a = 0. The leading 
term of A q is 

\nA q ^ l±^. q ]nq. (3.33) 

Comparing this to Eq. I|3.15p as it has been done before, one sees that for large x the function w(x) 
decays like exp [— (x/a) p ] with 

2 + «o 7- 1 ,„ Q/n 

p= —r—2- (3 - 34) 

This agrees very well with numerical results. The cutoff for 7 ^ 3 is no longer Gaussian. For 2 < 7 < 4, 
as often found in real networks, p is always greater than 1.5 and therefore the finite-size cutoff cannot 
be approximated by a pure exponential decay, observed in some networks [2]. Exponential cutoffs found 
in such networks probably have different origin [72]. The formula for M(z) is still given by Eq. (|3.20[) 
with the parameters a, b, . . . expressed through ao- The cutoff function w(x) is given by Eq. (|3.23|) . For 
instance, for ao = —1/2 which corresponds to 7 = 2.5 we get 



T(2/3) ^ (_a.)9 



w{x) = — - — }^ ■ 
6 q =o 



-9q 2~2q 



2r(l-2g/3) r(5/3-2g/3) 



(3.35) 



In Fig. l3.2l we plot w(x) for ao = —1/2, and 1. For numerical calculations all series have been truncated. 
The truncation error is less than 10 -4 in the plotted area. The results show that the curves become more 
flat when ao increases and agree well with w{x) obtained in simulations of finite-size networks. 

The initial graph has a great influence on the functional form of w(x). We do not consider here the 
dependence on the size no of the seed graph, but one can show that asymptotic properties of the cutoff 
function are insensitive to no and therefore for x being sufficiently large, w{x) ~ exp [— [x/a) p ] depends 
only on ao, i. e. only on the exponent 7 in the power-law n(fc) ~ . 

So far we have considered the model with m — 1, restricting ourselves to the case when graphs are 
essentially trees and possible cycles can only come from the seed graph. The general case m > 1 is much 
more complicated. Each of m proper links of a newly introduced node has to be connected to one of 
N preexisting nodes according to the preferential attachment rule. However, since multiple links are 
not allowed, the nodes to which links have been connected on this step have to be excluded from the 
set of nodes available for further linking. Thus, when a new proper link of a node is introduced, the 
probabilities of attaching it to one of the preexisting nodes are different depending on whether the link 
is the first, second, etc., of m. The rate equation for Nj~{N) can still be obtained in this case, but its 
structure is very involved. For example, for m = 2 and no = 3 (triangle as a seed graph) the full rate 
equation for Nj~{N) reads: 



N k (N + l) = N k (N)+S kl2 + ' 



47V- 6 



(fc ~ 1) ' 1 ~ 4jV-5-fc + ^) iVfc - l(iV) 



(3.36) 



where Sn denotes the auxiliary quantity: 



" kN k (N) 



47V - 6 - k ' 
fe=i 



Due to the presence of the Sn term, Eq. (|3 .36(1 is nonlinear in N k , and contains k in denominators. This 
makes impossible to apply our method in a straightforward way to the exact equation. In this case the 
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Figure 3.2: Plots of w(x) calculated from Eq. (|3.23j) for a = —0.5, and 1 (solid lines from top to 
bottom) which correspond to 7 = 2.5,3 and 4, respectively. The curves become flat with increasing a . 
The thick gray lines are w(x) obtained from averaged degree distribution for 2 x 10 4 generated networks 
of size N = 10 4 . The tails decay as exp(— x p ) with p — 3, 2, 3/2, respectively, in agreement with numerical 
findings. 



approximations are needed. The equations for m > 2 are even more complicated because of increasing 
number of possible ways of distributing m links at each time step. 

We can, however, make the following approximation. When the total number of links L is large, the 
probability that at each step we pick up two or more links pointing onto the same node is small. When 
the size no of the initial graph is much larger than m, this condition is fulfilled from the beginning and we 
expect that this approximation should work good. Within this approximation, the rate equation takes 
the form: 

N k (N + 1) - N k (N) + S k , n + J N ~\ Xk-l(N) ~ 2 (N-u;) NkiN) - 

The Kroenecker delta stands for the addition of one node with m links at each time step. The remaining 
terms give the probability of preferential attachment like in Eq. (|3,lj) . The denominators must give the 
normalization J2k kN k {N) — 2L. Assuming that we start from a complete graph with no nodes, we get 
the number of links L = m(N — lo) with 

w = n (2m + 1 - n )/(2m). (3.39) 

The factor m coming from 2L in the denominator cancels out with m coming from m possibilities of 
choosing links at each step. The choice of the same name "u>" above, as in Eq. (|3.28|) is not accidental. 
In fact, the cutoff function w(x) is now given by a very similar formula: 



W(x) = r(l + HQ — (jj) 



1 7 , s, m (n ) ~ 



T ( m + 2y^' no+ ^- ul ' i+m ^ ' I>o + 2)' 



(3.40) 



with mo (no) = no (no + l)/(m + 1), which takes the form of Eq. I|3.28p for m = 1. Thus the same scaling 
fcmax ~ N 1 / 2 holds also here. 

Like before, we expect some dependence on the initial graph, but as far as the asymptotic properties 
of w{x) are concerned the dependence should be negligible. Therefore, one should take the simplest 
possible initial configuration. The most natural choice is the complete graph with n = m + 1 because 
then H(k) = for all k < m at each step of the growth process. However, one should remember that 
the approximation used here works well only for m< no because Eq. Q3.38P with u) given by Eq. (|3.39p 
approximates reasonably the full rate equation only if m <§; N at each stage of the network growth. 

In figure 13.31 we compare our approximate analytical solution with Monte Carlo simulations of BA 
networks initiated from complete graphs with different no. One can see a small deviation between the 
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Figure 3.3: The function w(x) for m — 2 and for hq = 3, 7, 10 and 15 (curves from the flattest to the 
most peaked) . The agreement between the analytical solution and the results of computer simulations is 
not as good as before due to approximate character of the solved equation. The experimental points are 
obtained for N = 4000. The dip at about x = 1 and the peak at about x = 2 especially pronounced in 
the case of larger seed graphs mean that much more nodes with high degrees is present than it would be 
expected for the asymptotic power-law behavior of n oo (fc). 

analytical and numerical curves. The largest difference is for n = 3 and the smallest for n a = 15, 
confirming our earlier conclusion that the approximation is better for larger seed graphs. The asymptotic 
form of w(x) ~ exp(— x 2 /4) is the same as for growing BA trees, regardless of how many new nodes m 
we add per one time step. 



3.1.3 Numerical simulations of equilibrated networks 

We have performed extensive numerical simulations of various networks to cross-check analytic results 
for the scaling of the cutoff function. The results for growing networks have been just presented in the 
previous section. As we saw, they were in a very good agreement with theoretical predictions, and also 
consistent with earlier results presented in table 13.11 Now we shall describe results of the numerical 
computation of the cutoff for equilibrated graphs. We used the Monte Carlo generator described in 
Section 2.1.5 and in Ref. [53]. It performs a weighted random walk in the configuration space of the 
canonical ensemble. Each elementary step of the random walk is done using the T-rewiring and accepted 
with the Metropolis probability. We simulated three ensembles: equilibrated trees, equilibrated simple 
graphs, and equilibrated degenerated graphs. In all the cases we used basically the same algorithm except 
that in the first one we rejected rewirings violating the tree structure by introducing a cycle; in the second 
we rejected moves leading to multiple- or self-connections. This resulted in lowering the acceptance rate 
and algorithm efficiency, especially for tree graphs, in which case we had to extend the simulation time 
appropriately. 

In simulation of trees, as an initial configuration we chosen a GNR network with ao = 7 — 3 adjusted 
to have the desired value of the exponent 7 in the tail of the node degree distribution. The asymptotic 
degree distribution, given by Eq. (|2.66p . has (fc) = 2 for infinite GNR trees as it shoulcH. In order to 
preserve this distribution in the process of homogenization we had to set the ratio- weight function w(k) 
according to Eqs. (j2.49|) and <|2.59[) : 

< 3 "> 

This choice ensures that the mean value of the degree distribution stays at its "critical" value equal to 
k = 2L/N w 2 for trees. For densities of links below the critical one, one would observe an exponential 



2 Since L = N — 1, the average degree is in fact 2 — 2/N but it converges fast to the asymptotic result. 
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suppression of the degree distribution for large k, and for the average degree above two, the scale-free 
behavior would be disturbed by the surplus of highly connected nodes, or even by a condensation of links 
at some singular node. We simulated ensembles for three exponents 7 = 2.5, 3 and 3.5 for which the 
scaling exponent a should be 0.667, 0.5 and 0.4, respectively. For each of them we took trees of two 
different sizes N = 1000 and N = 2000, and for each we made between four and six independent runs 
in order to estimate errors by means of the standard Jackknife method [73]. We measured the degree 
distribution as well as a cumulative degree distribution (c.d.d.) defined as 

00 

P(fc) = ]>>(<?)• ( 3 - 42 ) 

q—k 

Because the degree distribution has the power-law tail II(fe) ~ fc -7 , the corresponding cumulative distri- 
bution behaves as P(k) ~ fc~ 7+1 . Any cutoff effects should be clearly visible also in P(k). The advantage 
of using the cumulative distribution is that one does not need to make binning to reduce statistical errors. 
One makes a rank plot instead. From Eq. (|2.66|) we get the following formula for theoretical P(k) for 
infinite graphs: 

For any large but finite network we expect, similarly as in section 3.1.2, some cutoff so that P{k) « 
Poo{k)V {k / N a ) . Here V(x) would be some universal function. If it is so, we should observe a collapse of 
curves P CX p(x) / 'Poo(x) plotted in the rescaled variable x = k/N a . In figure l3~4l we see that such a collapse 
indeed takes place for 7 = 3 and 2.5. This means that theoretical values of a (see table 13. lj ) agree very 
well with the experiment. However, for 7 = 3.5 the collapse is much better for a = 0.55 ± 0.03 than for 
the theoretically predicted value 0.4. This means that there are more nodes with high degrees than it 
is expected. In [65] it has been suggested, that in the case of disassortative networks like equilibrated 
trees presented here, the cutoff might be higher than 1/(7 — 1). On the other hand, the assortativity 
coefficient A increased with 7 so the assortativity is bigger for 7 = 3.5 than for 2.5 where we observe a 
perfect agreement. So it is not clear whether indeed the argument of Ref. [65] is entirely correct. 

Let us now discuss equilibrated simple graphs. We will generate weighted graphs with the degree 
distribution <|2.66[) . In our Monte Carlo generator (see Sect. 2.1.5) we have to set the weight from 
Eq. ([2759]) to 

- 

in order to get the stationary distribution given by Eq. I|2.66p in the limit of N — > 00. As before, we must 
keep the average degree equal to 2, which is the mean value of the distribution (|2.66p . As the initial graph 
we have chosen again a GNR tree, because from the very beginning it has the correct degree distribution 
equal to that produced by the graph rewiring process in the course of thermalization. The final results 
are the same when one begins with any random graph with N = L but the convergence to the asymptotic 
distribution might be in this case much slower. 

We simulated four ensembles: with 7 = 3, 3.5, 2.5 and 2.1, each of them for three sizes N — 2000, 4000 
and 8000. The acceptance rate of the algorithm was better than in case of trees and thus we were able 
to examine larger systems. Before starting simulations we suspected that the data would collapse to a 
scaling function V(x) better than for trees, because of less structural constraints. Surprisingly, as we see 
in figure 13.51 the collapse is worse and moreover, it takes place for different values of a than those give 
in table 13. ll and predicted in either [63] or [64]. In particular, for 7 = 3 where one expects a — 1/2, we 
measured 0.38 ±0.02. For the case 7 = 3.5 we found that the measured value 0.40 ±0.02 agrees with |64] 
which predicts a = 0.4, while for 7 = 2.5, a — 0.35 ± 0.03 is closer to the result of [63] which predicts 
a = 0.4. For 7 = 2.1 (not shown in the picture) we found a — 0.33 ± 0.01, which also agrees quite 
well with [63j. To summarize, the numerical results presented in this section do not give a conclusive 
evidence which of the theoretical predictions, |63] or [64] , for the scaling exponent a of the cutoff is better. 
Actually, for 7 > 3 the numerical value is closer to that of |64J while for 2 < 7 < 3 to that of [63J. One 
should, however, keep in mind that the numerical results are based on relatively small systems. For such 
networks, subleading finite-size corrections may be important and may interfere with the leading scaling 
behavior fc max ~ N a . The question how a depends on 7 for equilibrated S-F networks remains open. 

Let us now discuss Monte Carlo results for pseudographs. As before we simulated ensembles for 
7 = 2.5, 3 and 3.5, for N = 1000,2000,4000. The weight function w{k) is the same as in Eq. (|3~44j) . 



3 It can be shown that for the scale-free degree distribution H2.66|l . Tre grows with increasing 7 and so grows the coefficient 

A 
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7 = 3 7 = 2.5 7 = 3.5 




Figure 3.4: Top: plots of experimentally found cutoff function V(k, N) for equilibrated trees, plotted in 
the rescaled variable x — k/N a . The curves for N = 1000 and TV = 2000 collapse nicely for theoretical 
values of the cutoff exponent a for the first two plots (7 = 3 and 2.5) but significant difference is observed 
for 7 = 3.5. A better collapse has been found for a = 0.55 ± 0.03 (inset) than for theoretical value 0.4. 
Bottom: plots of rescaled cumulative degree distributions N a ^~ x ^ P{x) versus x. 
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Figure 3.5: Top: plots of experimentally found cutoff function V(x) in the rescaled variable x = k/N a 
for equilibrated graphs of different sizes, for 7 = 2.5, 3 and 3.5. Top: plots for a chosen to obtain the 
best collapse of data (by-eye fit). Bottom: the same data but for a predicted in papers [63] and [64] 
(the same result for both) for 7 = 3, in |63j for 7 = 3.5 and in [64] for 7 = 2.5, in order to show the 
discrepancy with numerical simulations. 




Figure 3.6: Experimentally found V(x) for equilibrated pseudographs of different sizes, for 7 = 2.5, 3 
and 3.5. The exponent a shown in figures gives the best collapse of curves. 
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In figure l3~6l we show the function V(x) obtained from rescaled numerical data: x — k/N a . The exponent 
a has been chosen in order to obtain the best overlap of the data for different N. We see that for 
pseudographs the numerical results are consistent with theoretical predictions. In table 13.11 we have 
a = 1/2 for 7 = 3, 3.5 and 2/3 for 7 = 2.5, while the corresponding exponents determined experimentally 
are 0.55 ± 0.03, 0.51 ± 0.02 and 0.66 ± 0.02, respectively. For 7 = 3 the experimental value deviates a 
bit from the theoretical ones, while for the two remaining values of 7 the agreement is perfect. It is not 
surprising that experimental results for pseudographs exhibit the best agreement with theory, since they 
have almost no constraints on the structure of graph. In fact, they may be effectively reduced to the 
balls- in-boxes model [36], which shall be discussed in next section. 

3.2 Dynamics on networks 

So far we have discussed networks as purely geometrical objects. However, many networks represent 
not only relationships between nodes but often they are viewed as a backbone of the complex system on 
which signals, matter, or other degrees of freedom can propagate. For example, networks are a convenient 
way of thinking about flows, transport, signal propagation, and information spreading between different 
objects. In other words, one is often interested in dynamics of some degrees of freedom which reside 
on the network and undergo some dynamical evolution transmitted by the structure of the network. A 
good example is the air transportation network with airports as nodes and flight connections as links. 
The dynamical variable in this case is the number of passengers or goods. The flow is proportional 
to the intensity of the air traffic between airports. Another example is the voter model |74[ [75] which 
is used to mimic an opinion formation among different individuals. The quantity which is this case 
"transmitted" on the network of acquaintance relations is the opinion. The distribution of opinions on 
the network is represented by discrete variables defined on nodes. In the simplest version, nodes change 
their state by copying the state of a randomly chosen neighbor. There are many other examples: the 
traffic of data packets in the Internet [76], epidemic spreading ("ill" nodes infect other nodes) [7711781 179]. 
synchronization of coupled oscillators |80 [ 181" ! [82]. etc. 

It is worth mentioning, that the observed dynamical properties in models defined on scale-free networks 
are often quite different from those on regular lattices. It is so because S-F networks have two important 
properties mentioned at the beginning of this thesis: the distance between every pair of nodes is relatively 
small, and the power law in the degree distribution leads to the emergence of nodes with a high number 
of connections. This inhomogeneity in the degree distribution seems to play a very important role. In 
this chapter we shall examine the influence of the node degree inhomogeneity on the dynamics of systems 
defined on networks. We shall use a very simple model, called zero-range process. This model is easy to 
handle analytically. For example, one can show that its steady state has a simple factorized form which 
makes it possible to solve the statics of the model analytically. Despite its simplicity, the model exhibits 
a very interesting behavior, e.g. it has a phase transition between condensed and uncondensed state. 

In the first subsection we shall define a zero-range process and review its basic properties. In particular, 
we shall discuss the criterion for the condensation on homogeneous and inhomogeneous networks. Next, 
we shall discuss in details a derivation of most important results in the statistical ensemble approach. 
In subsection 3 we shall study the dynamics of the condensate. We shall describe how the condensate is 
formed from a diluted state and how it behaves once it is formed. For example, we shall ask how much 
time the condensate needs to disappear from the node it occupies. In the last subsection we shall show 
how to obtain scale-free fluctuations in the inhomogeneous system. 

3.2.1 Zero-range process 

The zero-range process (ZRP) is a particularly simple diffusive system which describes dynamics of 
balls (particles) on a given network. The balls hop from site to site on the network and the hopping rate 
depends only on the number of balls at that site from which the ball hops. Despite its simplicity the model 
exhibits many interesting properties like phase separation, phase transition, long-range fluctuations and 
spontaneous symmetry breaking, observed in more complicated systems with mass transport. Therefore, 
it has attracted a great attention recently [83l (EH [85l [MJ [8Zl EE] ■ In comparison to more realistic models, 
it has one advantage: the steady state of the system is known exactly and it assumes a very simple, 
factorized form. It is worth to mention here that static properties of the model are the same as in the 
balls-in-boxes model 36. 8.9] developed earlier and successfully applied to explain such phenomena as 
for instance wealth condensation [90] , emergence of the Hagedorn fireball in hadron physics [911 [92] or 
a collapse of random geometry in the quantum gravity [EH [93] . Although the statics of the zero-range 
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process is relatively well known, its dynamics, in case when it takes place in an inhomogeneous system, 
has not been yet so thoroughly studied. 

We shall consider a zero-range process on a connected simple graph with N nodes and L links. Each 
node i of the graph is occupied by rrii > identical balls and the total number of balls is fixed and equal 
to M. The system undergoes the following dynamics: balls hop from non-empty nodes with rate u(m), 
which depends only on the node occupation number m, to one of the nearest neighbors, chosen with 

equal probability. The function u(m) is any non-negative function defined for m = 0, 1, 2, For a node 

which has k neighbors, the hopping rate per link emerging from this node is equal to u(m)/k since each 
link is chosen with equal probability 1/fc. 

It is easy to implement this type of dynamics on a computer. At each time step we pick N nodes at 
random. From each of these nodes, occupied by at least one ball, a ball is moved to a node chosen with 
equal probability from its ki nearest neighbors. The move is accepted with probability proportional to 
u(rrii), otherwise it is discarded. The jumping rate u(mj) must be properly normalized to be less than 1. 
With such a definition, one unit of time corresponds to one sweep of the system that is to N attempts of 
moving a ball. 

The ZRP has a steady state. Following Ref. [94], we shall present here a short derivation for an 
arbitrary network having adjacency matrix Aij. We are interested in the probability P{m\, . . . , mjv) 
of finding the system in a particular state with given number of balls on each site. In the stationary 
state, this probability must be constant, as a result of balance between hopping into and out of the given 
configuration: 



u(mi)P(mi, . . . ,m N ) 



^ Ai 3 ^-u(mj + 1)P(. . . ,m 3 ■ + 1, . . . , mi - 1, . . . ) 



(3.45) 



for each node i. The sum over j includes only neighbors of node i, each of them gives the contribution 
oc 1/kj since it has kj — 1 other neighbors than i. Assume now that P(m\, . . . ,mjv) factorizes into some 
functions /i(m,): 



JV 



P(m u ...,m N )= Z( ^ M) II km), 



(3.46) 

where Z(N, M) is an appropriate normalization. Inserting this formula into Eq. (|3,45p we have: 

0. (3.47) 



T-u(mj + l)fj(m,j + l)M m i ~ !) - }«(m,)/ 1 (m,)/j(m J ) 



This equation is fulfilled only if each term of the sum over j vanishes separately: 

—uinij + 1) — = = —uirrii)— . 

k j /iK) ki Mrm- 1) 



(3.48) 



The left-hand side depends on m 3 while the right-hand side on mj. To be equal for any m; and rrij they 
have to be a constant function, independent of m. Without loss of generality we can set it equal to one. 
We get: 

k 

fi(m%) = i 1 , fi(mi - 1). 



u(rrii)' 

Iterating this equation we come to the formula for fi(m): 

/<(m) = fcr/K), 

where we have introduced the function f(m) which is independent of ki and reads: 

m 1 

/M=n^ /(o)=i- 



(3.49) 



(3.50) 



(3.51) 



The splitting into a site-dependent and a site-independent part is convenient when one considers regular 
graphs having all degrees fa the same. The partition function Z(N, M, {fa}) being a sum over all states 
mi, . . . ,mjv has the form: 



M 



M 



N 



Z(N,M,{fa})= J2 •■• E ^muuUf^i 



(3.52) 



mi-0 rriN—0 



i=l 
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Here the Kronecker delta gives the conservation law of the total number of balls. For convenience we shall 
denote Z(N, M, {ki}) of the original graph in short by Z(N, M), skipping the dependence on the sequence 
of degrees {ki}. The partition function (|3 . 52[) encodes the whole information about static properties of the 
system. It depends only on the node degrees and a detailed structure of the graph has no meaning. As we 
shall see, also dynamical quantities, like the typical life-time of the condensate, are mainly characterized 
only by the degree sequence {ki}, if the graph has a small diameter. 

In order to study static and dynamic behavior of the ZRP it is convenient to define an effective 
distribution of balls 7Tj(m), that is the probability that site i is occupied by m balls, averaged over all 
configurations: 7Tj(m) = (# m , mi ). It can be calculated as follows: 



n{rrn) = ^2 ' ' ' ' ' ' -^ 3 ( mi ' • • • ' m N) $J2i=i mi,M = z(N ~M) ~ fc "^( mi )' ( 3 " 53 ) 

mi TMi_i m^i jtin 



where Zi(N — 1, M — mi) denotes the partition function for M — mi balls occupying a graph consisting 
of N — 1 nodes with degrees {ki, . . . , fcj_i, fcj+i, . . . , kpj}. It is important not to think about Zi as of 
partition function of the original graph with the ith node removed, but rather as of a new graph built of 
the old sequence of degrees without ki. We define also the mean occupation probability as the average 
over all nodes: 

7r(m) = (l/JV)^7Ti(m). (3.54) 

i 

It is worth mentioning that for graph with fci = • • • = fcjv = k = const, that is for a /c-regular graph, the 
above formulas reduce to that known from the balls- in-boxes model and the distribution 7T^(m) = ir(m) 
is the same for all nodes. We will see below that sometimes 7r(m) is indeed equal to f(m). Therefore we 
shall call /(to) "bare" occupation probability. 

The partition function can be calculated recursively: 

M N-l 
Z(N, M, {k 1: k N }) = ]T f N (m N ) Yl fe 1 m t M-m N II 

m N — mi,...,mjv-l i=l 

M M 

= fN(m N )Z N (N-l,M-m N )= z 2Mrn)Z(N-l,M-m,{k 1 ,...,k N ^ 1 }). (3.55) 

mjv— m— 

For N = 1 the partition function reads simply Z(1,M, ki) — fx(M). The formula i|3.55p can serve for 
numerical calculations of the partition function for systems of order few hundreds nodes or more. Using 
it together with Eq. (|3.53|) we can compute the distribution of balls in a more efficient way that by Monte 
Carlo simulations. 

The knowledge of the partition function allows one to calculate correlation functions of higher order. 
For small systems we can calculate Z(N,M) exactly from Eq. (|3.55p . for large systems it is better to use 
the definition (|3.53|) which allows for some approximations. In the thermodynamical limit of N — * oo 
it is therefore convenient to introduce the density of balls p = M/N and to study the limit of fixed p 
while increasing N, As we shall see below, for large systems, i.e. for N 7 M large, p becomes a relevant 
parameter of the model. 

The dynamical and static properties of the ZRP depend strongly on the function u(m) and the degree 
sequence {ki}. For the model described here we can distinguish two classes of systems. From now on, 
by a homogeneous syste we shall understand the network with all fej's being equal. It is true for 
instance for a complete graph, one-dimensional closed chain or a /c-regular random graph. In contrast, 
an inhomogeneous system has a non-trivial degree sequence, with at least one degree different from 
others. This is the case for random graphs, star graphs and scale-free networks. 



3.2.2 Condensation in the ZRP 

The reason why zero-range processes are so interesting is that under some conditions one observes a 
phenomenon of "condensation" in the steady state. In this phenomenon, a single node takes a finite 
fraction of all balls present in the system. The effect does not disappear in the thermodynamic limit. 
The condensation can be observed in the occupation distribution 7r(m) as a separated peak, whose position 
moves almost linearly with the system size. Unlike the Bose-Einstein condensation which takes place in 

4 We mentioned before, that equilibrated networks were sometimes called "homogeneous networks". In this paper, however, 
we shall always use the term "homogeneous" while speaking about networks with equal degrees. 
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the momentum space, the above effect appears in the real space. Therefore it mimics such processes like 
the mass transport [83J, condensation of links in complex networks [35, 84] or phase separation |95} 196), 
The condition for the emergence of the condensation in homogeneous systems is well known [671194] . 
On the other hand, until now only a few inhomogeneous systems have been examined [97, 98) 99]. In this 
section we summarize some results and discuss methods of derivations for the existence of condensate. We 
begin with homogeneous systems and then we present our recent results for graphs with inhomogeneous 
degrees. 

Homogeneous systems. For fc-regular connected simple graphs, which we shall consider in this 
section, the static properties of the steady state depend only of the hopping rate u(m), the number of 
nodes N and the number of balls M, and are independent of the details of graph topology. The partition 
function assumes the form: 

M At N 

z(n,m)= ••■ E ^"k.mII/K)- ( 3 - 56 ) 

m-i—0 mjv— i— 1 

The factor k M has been dropped since it is constant for given k and M. Similarly, for the distribution of 
balls we have 

Z(N-l,M-m) t , 

7r(m) = zJnJT) /(m) ' (3 ' 57) 

From the definition <|3.56j) of Z(N, M) we can obtain another formula for the distribution of balls: 

, ,, ,d\nZ(N,M) 
n to = AT' 1 / to ) \ \ 3.58 

and hence 7r(m) is proportional to the derivative of the "free energy" InZ and the bare occupation prob- 
ability /(to). Notice a similarity between this formula and that of Eq. I|2.35|) for the degree distribution 
of equilibrated graphs. As we will see, indeed, there is a close relation of the ZRP and equilibrated 
pseudographs. 

Now we shall study, how the behavior of u(m) influences on the emergence of condensation. From 
Eq. I|3.49p we see that for homogeneous system there is a correspondence between the hop rate u(m) and 
the function /(to): 

u(m) = f (to — 1) / / (to) /(to) = /(to — 1)/u(to), (3.59) 

and in many cases /(to) is more convenient, so we will stick to it below. Using the integral representation 
of Kronecker's delta we can rewrite the partition function as 

Z(N, M) = j> ^-z- M ~ l [F(z)f , (3.60) 

where F(z) is an infinite series with coefficients given by f(m): 

oo 

F(z) = f(m)z m . (3.61) 

m=0 

Denote the radius of convergence of this series by r (finite or infinite) . The partition function (|3.60p has 
the same form as the p.f. (|2.3ip for pseudographs, up to a factor depending only on N, L. Therefore, 
equilibrated pseudographs can be mapped onto a homogeneous system of balls in boxes with M = 2L. 
The degree distribution Il(fc) is then equivalent to the ball distribution tt(to). Therefore, many results 
presented below apply also to pseudographs from Sec. 2.1.4. In the thermodynamical limit, the integral 
(|3.60p can be rewritten as 

Z(N,M)& I ■^exp[-N{p]nz-\nF(z))], (3.62) 

J 27TI 

and can be done using the saddle-point method: 

^/2ttNG"(z ) z, 



Z, A 1 1 . * , J,,,,, /^r , (3.63) 
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where G(z) = —phiz + InF(z) and the saddle point zq is determined by the condition G'(zo) = 0, 
analogously to the case of pseudographs: 

P = Z °FlH- (3 - 64) 

The saddle point solution <|3.63|) holds as long as the Eq. <|3.64|) has a real solution for z . If not, the 
formula p. 63ft cannot be trusted and in fact, as we will see, it breaks down in the condensed state. First, 
let us consider a situation when Eq. (|3.63|) has a real solution for zq. In this case the leading term in the 
free energy InZ is given by G(zq), Differentiating this with respect to f(m) we obtain the distribution 
of balls: 

Tr(m) = f{m)pnj- (3-65) 

Hence, if z = 1, 7t(to) oc f(m) which explains the name "bare occupation function" for /(m). One can 
directly see from the definition that F(z) is an increasing function of z. Similarly one can see that the 
right-hand side of Eq. i|3.64p increases monotonically with zq as long as zq is smaller than the radius of 
convergence r. It means that zq increases when the density p increases. If the series (|3.6ip is convergent 
on the whole complex plane (r — - > oo), the saddle point solution for n(m) is valid for any density p. This 
happens only when /(to + l)//(m) — * for to — * oo, which corresponds to u(m) — + oo. In turn, this 
means that there exists an effective repulsive force between balls preventing them from occupying a single 
site. So in this case balls tend to distribute uniformly on the whole graph, regardless of the density of 
balls p. Consider now u(m) ~ m s for an arbitrary 5 > 0. Then /(to) is given by 

fH ex jlrg, (3.66) 
(to!) 

and it is clearly seen from Eq. I|3.65p that the distribution of balls falls faster than exponentially. 

On the other hand, it is possible to choose /(to) so that F{z) has a finite radius of convergence. This 
situation happens when u{m) tends to a constant for to — > oo. Because multiplying u(m) by a constant 
factor simply corresponds to rescaling the time axis, without loss of generality one can set u(m) — > 1 in 
the limit of large to. To be more specific, let us consider the case u(m) = 1 + for which we find the 
following formula for /(to) [100J: 

r(6+l)m! „ r(6+i) 

1 (6 + TO + 1) TO 

which falls like a power of to for large to. In this case the series F(z) has a finite radius r — 1. We 
must consider now two cases: b < 2 and b > 2. For b < 2 the derivative F'(z) goes to infinity when 
z approaches one from below. But the ratio F'(z)/F(z) is finite for z < 1 and thus the density from 
Eq. I|3.64p can be arbitrarily large. The saddle-point approximation works well: 

z m 

Tr(m) ~ (3.68) 

for all values of p. We see that in this case the distribution of balls falls off exponentially for large to. 
The case b > 2 is different because the ratio F 1 (z) / F{z) cannot grow above some critical p c : 

F'(l) T, m mf(m) 1 

at which Eq. (|3.65p cease to hold for real values of zq. But of course we can put as many balls in the 
system as we want, so we can increase p above p c . What happens then? At the critical point p = p c , the 
distribution of balls is given by 

, , /(to) r(6 + i) 

b (1) to 

in the thermodynamical limit. It has a finite-size cutoff for N < oo which scales as [67] 

~ jvV(6-i) for 2<6<3, (3.71) 
- N 1/2 for b > 3, (3.72) 

exactly like for the cutoff of the degree distribution for degenerated graphs. For p > p c the saddle-point 
equation is no longer valid. To understand what happens then, recall the Bose-Einstein condensation. 
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Below the critical temperature T c , the fraction of particles occupying all energy levels above the ground 
state is equal to (T/T c ) 3 / 2 and it is less than one. The only way to keep the average number of particles 
fixed is to let them go into the lowest energy level which does not contribute to the partition function, 
which in the thermodynamic limit is given by the integral over energy |101| . The situation is a bit similar 
to the condensation of balls. Above the critical density some nodes take the surplus of balls, while the 
rest follows the critical distribution l|3.70p . In |102| one can find a complete proof. Here we only recall 
the main arguments [94] standing behind it. Assume that the system is deeply in the condensed phase, 
so that M ^> Np c . Denote the excess of particles by A = M — p c N. The canonical weight of each 
configuration is 

± J {m 1 , . . .,m N ) = f(m 1 )...f(m N )~(j[m^J . (3.73) 

Let us estimate the contribution to the partition function from states where the surplus of balls occupy 
one, or two nodes. The contribution to Eq. (|3.73p from a single-site condensate is NA~ b . From two-node 
condensation we however have N(N — l)/2 x (A/2)~ 2b . Because A w M — pN, we get for these two 
states: 

N 1 - b p- b and {N - l)N 1 - 2b 2 2b - 1 p- 2 \ (3.74) 

respectively. The second expression disappears faster in the thermodynamical limit, so we infer that the 
condensate emerges on a single node taking A balls on the average. One can consider also higher-nodes 
states but they disappear even faster when N — * oo. The condensate is seen as a peak 7r CO nd(m — A) in the 
distribution of balls. Since it occupies only one node, the area under the peak is equal to l/N. Because 
in the remaining part of the system there are only p c N balls, the background of the distribution 7r(m) is 
perfectly described by the saddle-point solution (|3.70p . The complete expression for 7r(m) including the 
condensate reads 

7r(m) « r( ° b+ h ^ + ^ond (m - (M - p c N)) . (3.75) 

The form of 7r con( j(m) has been investigated in [67] in the model with continuous masses m,. In the model 
considered here one can take a quasi-continuous limit by letting M — » oo and rescaling u{m) properly. 
For 2 < b < 3, the peak is approximated by 

7w(z) = N- b '^V b (^rry) , (3.76) 

with Vb(z) given by the integral: 

/too J 
-ioo 2m 

which falls as |z|~ b for z — > — oo and faster than a Gaussian function for z 
b > 3 in the vicinity of m — A, the peak falls like a Gaussian: 

n cond (x) £* 1 exp(-x 2 /2a 2 N), (3.78) 
V 2tt(j z N J 

with a 2 = (m 2 ) — (m) 2 being the variance of f(m). In both cases, the area under 7r CO nd(™) is equal to 
l/N, 

The picture we see now for b > 2 is the following. Below the critical density the distribution of balls 
is given by a power law suppressed additionally by an exponential prefactor. At the critical point this 
prefactor vanishes and we observe a pure power law, disturbed only by finite-size effects. Above the 
critical point, the condensate emerges at one node chosen at random (spontaneous symmetry breaking) 
from all nodes. The condensate does not need to occupy this particular site for the whole time. In fact, 
we will see that it performs a kind of random walk through the system, but the process of melting and 
rebuilding the condensate is fast in comparison to the mean occupation time. The condensate contains 
M — p c N balls on average. 

At the end, let us mention the case u(m) — > for large m. In this case /(to) increases fast with m 
and the series F(z) has a zero radius of convergence. The critical density p c is zero and therefore the 
condensation takes place at any density p > 0. Balls attract so strong that almost all of them fall into a 
single node chosen at random. 

In Fig. 13. 71 we show n(m) for the three different types of the hop functions /(to) discussed above, for 
various densities p. The data are obtained by means of the recursion formula (|3.55|) for the partition 
function and, if possible, compared with the saddle-point solution <|3.65[) . 



(3.77) 

oo. On the other hand, for 
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Figure 3.7: Plots of the distribution of balls 7r(m) for a homogeneous system with various /(to), N, and 
p. Left: f(m) ~ 1/m!, TV = 40, for p = 0.5 (diamonds), 1 (squares) and 2 (circles), no condensation. 
Middle: /(tti) ~ m -3 , p c = 1 and N = 400. For p = 0.5 (the left-most line) there is an exponential 
cutoff. At the critical density p = p c the power-law behavior is seen. Above p c , the surplus of balls 
forms the condensate (the right-most line) . The curves plotted with symbols are obtained by recurrential 
calculation of Z(N,M) from Eq. I|3.55|l . Solid lines represent the saddle-point solution (|3.65[i . Right: 
f(m) ~ tti!, N — 40, for p — 0.5, 1, 2 (from left to right). The condensation is always present. 

Inhomogeneous systems. Now we focus on inhomogeneous networks, whose degrees vary from node 
to node. It turns out [HH [97] that the effect of inhomogeneity is so strong that it completely dominates 
over the dependence on the hop rate 7t(?7i) as long as the latter does not change too fast with to. For 
simplicity we can assume u{m) = 1 and concentrate only on the effects coming from the inhomogeneity 
of degrees. Then, the zero-range process describes a gas of M indistinguishable and non-interacting balls 
randomly walking on the given network. 

The most interesting case of graph with inhomogeneous degrees is of course the S-F network. It 
was studied in [97l I103| but because of its complicated structure, only very simple calculations of the 
static properties were possible. Here we decided to focus on the effect coming from the node with largest 
degree, say k\. To further simply considerations, we just imagine that the effect can be well simulated 
by assuming the identity of the remaining degrees. In effect we are led to consider an almost fc-regular 
graph which has one node of degree ki bigger than the others which are of degree k% = ■ ■ ■ = kjy = k 
|99| . We shall call it a single-inhomogeneity graph. 

To construct fc-regular graphs and the single-inhomogeneity graph one can use various methods. For 
instance, one can start from a random graph with given number of vertices and links and rewire it until 
all nodes will have desired degrees. Another method of building a fc-regular graph is to start from a 
complete graph with k + 1 nodes and build the desired graph successively adding nodes and links. The 
procedure depends on the parity of k. If k is odd, then the number of nodes has to be even because 
the number of links L = Nk/2 must be integer, otherwise the graph cannot be built. At each step we 
pick up k — 1 existing links and split them so that nodes being formerly joined by these links, have now 
"halves" of them. Then we introduce two new nodes i,j joined by a link. Finally, one half of "free ends" 
of previously split links is joined to the newly added node i, and the other half to j (Fig. 13.8a ). In this 
way every node has now k neighbors. We repeat this process until the total number of nodes is equal to 
N. In case of even k, the algorithm is similar, but we add only one new node per time step, and split fc/2 
existing links (Fig. 13.8b ). Sometimes, as a result of nodes' addition, multiple connections might arise. To 
prevent them, we discard such moves. For k much smaller than N, they are rare and the acceptance of 
the algorithm is almost 100%. We use the structure described in [53] to code the graphs. 

A single-inhomogeneity graph with one node having degree k\ > k can be then obtained from a 
fc-regular graph with TV — 1 nodes by adding to it a new node and joining to it k\ links coming from 
splitting fci/2 randomly chosen links existing previously. If we get multiple connections, we discard this 
move and try again. 

Before we calculate the partition function for a single-inhomogeneity graph it is convenient to calculate 
it for a fc-regular graph. Because we set 77(771) = 1 in this section, as mentioned before, hence also 
/(to) = 1. For a fc-regular graph the partition function Z(N, M) from Eq. (|3.52[) is 

M M . 

Z ies (N,M)= ]T ••■ S mi+ ... +mN , M k^i m *=k M —fdzz- M - 1 [F(z)} N , (3.79) 

mi— mjv— 
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Figure 3.8: The illustration of the method for generating ^-regular graphs, based on addition of new 
nodes: a) for odd k (here k = 3), b) for even k (here k = 4). 



where now F{z) reads 



Using the expansion: 



M 



(1-z) 



-N 



n*) = E 



m=0 v 



1 - z 



m=0 v 



(3.80) 



(3.81) 



and Cauchy's theorem which selects only the term with m = M, we finally get an exact expression for 
the partition function of a A:-regular graph: 



Z ieg (N, M)=k 



A I 



N + M-l 

M 



The partition function for a graph with one irregular degree k\ > k has the form: 

M M M 

Z inh (N,M)= E( fc i) mi E ••• E 8 M , mi +-+ mN k m2+ - +n 



(3.82) 



(3.83) 



mi— 



ni2— 



The sum over rrc.2, . . . ,mjv is just the partition function Z rcg (N — l,M — mi) from Eq. (|3.82[) . After 
changing the variable from mi to j = M — m\, the formula l|3.83p can be rewritten as 



A I 



Z inh (N,M) = kMJ2' 



3=0 



N + j-2 
j 



(3.84) 



where a — k/ki describes the level of "inhomogeneity". Introducing an auxiliary function 

1 



3=0 



-(N-l)\ 

j ) (1 -a)""*' 



we obtain the following expression: 

Z inh (N, M) = kf [(1 - a) 1 ^ - c(M)} , 
where c{M) gives a correction for finite M. It tends to zero for M — > oo: 



z{M)= £ 



j=M+l 



N + j-2 



(3.85) 



(3.86) 



(3.87) 



This correction can be however quite large for fei s» k because then a ~ 1 falls slowly and the binomial 
term increases with j. We can estimate the correction by the saddle-point method, replacing the sum by 
the integral: 

1 



z{M) 



(N-2)\J M 



(3.88) 
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Figure 3.9: The distribution of balls at the singular node for single-inhomogeneity graphs with k = 4, 
N = 20, fei = 8 (left) and ki = 16 (right). The total number of balls is M = 20,40 and 80 from left to 
right, respectively. Points represent numerical data while solid lines the solution from Eq. (|3.95|l . 



where we define a new function F(J): 

F(j) = j In a + ln((iV + j - 2)!) - ln(j!) 
and use Stirling's formula for factorials. We get: 



(3.89) 



/ 

JM 



Ml) 



dj « e 



erfc 



I M JM V 2F"(j*) 

where erfc(x) denotes the complementary error function: 

2 f°° _ 2 
erfc(x) = —= / e y dy, 
V 71 " Jx 

and is the point being a solution of the saddle-point equation F' (j* 

a(N-2) 



((M-J*)V-F"(j*)) , (3.90) 



J* 



1 - a 



Only leading terms in F(j) were taken into account. Collecting all together one finds 



c(M) = 



a -^r~ ((JV-2)/(l-a))! _/7ra(iV-2) 1 
1-a (a(iV-2)/(l-a))!' 



(JV-2)! 



erfc 



M(l - a) - a(7V - 2) 
V«(JV-2) / 



(3.91) 



(3.92) 



(3.93) 



The complete partition function Z-^iN ', M) is given by Eq. (|3.86[) with c(M) calculated by means of 
Eq. (|3.03|) . We can now calculate tti (m) that is the distribution of balls at the node with degree k\ : 



U ' Z inh (N,M) 1 



(3.94) 



where Z rcg (N, M) is the partition function for a fc-regular graph. Making use of the formulas ([3. 82ft and 
(f3T86|) we get 

M _ m fM + N -m-2\ ...... 

M -m 



7Ti (m) = a 71 



2 ) [(l-a) 1 ^-^)]- 



(3.95) 



In figure [3]9] we show a comparison between the analytic solution (|3 .95(1 and the one obtained from Monte 
Carlo simulations. Neglecting inessential normalization, the equation l|3.95p has the following asymptotic 
behavior: 

7ri(m) - exp(G(m)), (3.96) 



where 



G(m) = -mlna + ( M + N - m - | j ln(M + JV - m - 2) - yM - m + - ) ln(M - m). (3.97) 
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Condensation takes place when G(m) has a maximum for positive m = to, Taking the derivative of 
Eq. I|3.97p and neglecting terms of order 1/M 2 in the corresponding equation G"(to*) =0we find 



M-- (JV-2). (3.98) 



Let us calculate the mean number of balls at the first node: 

M M 

{mi) = *l(m)m = M - ^ Jn(M ~ (3.99) 

rn— j— 

In the large M limit, the sum over j can be calculated exactly: 

M - ^^fJn? = M - a**m = M - — ^ — (N - 1) « to, (3.100) 

The condensation occurs when an extensive number of balls is on the singular node. This happens when 
to* > or equivalently when (toi) > 0. Therefore, the critical density in the limit N, M — > oo with fixed 
density p — M/N is 

Pc = j^—. (3.101) 

1 — a 

The condensation is possible only for the density p > p c , exactly like in the Single Defect Site model 
[94] . The critical density decreases with decreasing ratio a — k/k\ or, equivalently, with increasing 
inhomogeneity ki/k. The site, which contains the condensate, has N(p — p c ) + p c balls on average, as 
follows from Eq. I|3.100p . It is also easy to find that the distribution of balls 7i"i (to) at any regular node 
with degree k falls exponentially 

/ k \ m 

m(m) oc(— j =a m , (3.102) 

with a < 1. Thus the condensation never appears on a regular node. When the system is in the condensed 
phase, the mean occupation of such a node is equal to p c independently on the total number of balls in 
the system. 

Let us consider also a special example of a single-inhomogeneity graph called a star graph, which has 
one node of degree ki = N — 1 and N — 1 nodes of degree k = 1. Because k\ increases when the system 
grows, the parameter a goes to zero as X/N. The critical density p c — > in the thermodynamical limit. 
Thus on a star graph the condensation appears for any finite density p > 0. We can calculate the variance 
of toi which is a measure of fluctuations. Introducing p = — In a we have 

((TO! - (TOx)) 2 ) = L , (3.103) 

and inserting p = hi(N — 1) for the star graph we get (^jfz^J which tends to one when N — > oo. 

Therefore, for almost all time the condensate has all balls but one, the mean value (mi) « M — 1 as it 
follows from Eq. (|3.100p . and fluctuations are small. The occupation of other sites must be thus close to 



3.2.3 Dynamics of the condensate 

One can address two natural questions while studying the dynamics of the condensate: i) how is it pro- 
duced from a uniform distribution of balls, and, ii) how much time does it take to melt the condensate and 
rebuilt it at another site? The answer to these questions is different for homogeneous and inhomogeneous 
systems. Moreover, in both cases the dynamics depends on the structure of network, not only on degrees. 
For instance, one can imagine that there is a bottleneck, e.g. a single link joining two larger parts of 
the network. The transport of balls on such graph will be different from the case when these two parts 
are strongly interconnected. As we will see, however, the structure of the network is not so important as 
one could think, and characteristic time scales are determined mainly by the size of the system and its 
(in) homogeneity. 

The emergence of the condensate from a state where all nodes have approximately equal occupation 
numbers has been investigated for homogeneous systems [94], |100| . |102j . The process can be divided 
into two stages. First, the surplus of balls A accumulates at a finite number of nodes. When this happens, 
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Figure 3.10: Time scales for coarsening dynamics of the condensate on various homogeneous networks. In 
numerical simulations we measure the average time (r) after which the maximal number of balls exceeds 
A = M — p c N at some node. The system starts from a uniform distribution of balls and the simulation is 
stopped when m„ max > A. The time t is collected and the procedure is repeated. At the end we calculate 
(t). Repeating this a few times we estimate errors. Circles: complete graph (D > 2), squares: 4-regular 
random graph (D > 2), diamonds: 2d periodic lattice (D — 2), triangles: one-dimensional closed chain 
(D = 1). Lines are asymptotic solutions from Eq. (|3. 104(1 with proportionality coefficient fitted to data. 
In all cases M = 4N and 6 = 4, hence the density p^$> p c = 1/2. 

small condensates exchange particles through the nearly-uniform background. This results in coarsening 
of many condensates which eventually form a single one with a larger number of balls. This process is 
very slow. Assuming the jumping rate in the form u(m) = 1 + b/m and that we are in the condensed phase 
p > p c , the mean condensate size grows as A(t/r) s , where the characteristic time scale t for coarsening 
dynamics has been estimated as 

( N 3 for D = 1, 

N 2 In N for D = 2, (3.104) 
[ iV 2 for D > 2, 

and the exponent 5 is inversely proportional to the power of N in the expressions above . Here D is the 
dimension of the network, e. g. D = 1 for a closed chain, D = 2 for a two dimensional lattice and D = oo 
for a complete graph. In figure [37X01 we present the comparison of these asymptotic formulas to computer 
simulations. 

Contrary to the coarsening dynamics, studies on the dynamics of an existing condensate in homoge- 
neous systems are rare [100]. One can ask what is the typical life-time of the condensate, that is how 
much time it takes before it disappears from one site and rebuilds at another site. Let n max be the 
position of the node with maximal number of balls. In figure I3TTT1 we plot n max as a function of time, for 
different densities p, for a regular graph. It is clearly seen that the characteristic time between jumps is 
much larger in the condensed phase. This means that the condensate, once formed, spends a lot of time 
without any move and then suddenly jumps to another node. In |100| authors investigated this process 
on a complete graph. Using a Markovian ansatz that the number of balls on the condensed site varies 
slowly in comparison to other m^'s one can recast the original problem into a biased diffusive motion on a 
one-dimensional interval. The authors showed that average crossing time, i. e. the time between melting 
the condensate and rebuilding it at another site, can be approximated by the formula 

f oc (p - Pc ) b+1 N\ (3.105) 

for b > 1. Thus for fixed size of graph and far above p c , T grows like a (b + l)th power of the density of 
balls. For the fixed density, T grows with N as ~ N b . This formula holds only for quite large systems 
and therefore it is hard to verify in Monte Carlo simulations. In figure l3~T2l we see that for small systems 
the power-law dependence is rather on M than on (p ~ p c ) as it would stem from Eq. (|3. 105|) . 

Let us discuss now inhomogeneous systems. Although the emergence of the condensate in zero- 
range processes has been extensively studied, not much is known about their dynamics. The coarsening 
dynamics has been examined numerically for scale-free networks in Ref. [97], where the jumping rate was 
taken to be u(m) oc m s with 8 > 0. It was observed that the dynamics is hierarchical. First, balls on 
the sub-network of small degrees are equilibrated, then nodes with higher degrees are equilibrated, and 
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Figure 3.11: From left to the right: position of the node n max with maximal number of balls as a function 
of time, for different densities p = 0.5 (below p c = 1), p = 1.5 and far in the condensed phase p — 2.5. 
The network is a 4-regular graph with N = 20 nodes and the hop rate u(m) oc 1 + 3/m. In the condensed 
phase, the condensate occupies a single site for a long time and then moves to another site, thus jumps 
in n max occur seldom. 



finally the hubs - the nodes with highest degrees. The global stationary state is reached with the time 
t ~ TV 2 , where 

1 + a - S for trees, . . 

1—5 for network with loops, 

and a is the exponent from the cutoff scaling law k max ~ N a . Below we shall discuss the dynamics of the 
condensate, once it is formed. This issue has not been studied yet. Although we study only simplified 
models, the results will allow us to derive some conclusions about how this process looks like on S-F 
networks. 

We shall consider the dynamics on a single-inhomogeneity network introduced earlier. It is a very 
good candidate to examine how inhomogeneities influence the typical life-time of the condensate. In order 
to determine this typical time T after which the condensate melts, we should first define this quantity 
properly. We have seen in the figure l3T2l that for small homogeneous systems it was impossible to reach a 
good agreement with theoretical predictions. One of the reasons might be that, in fact, the crossing time 
|100j has not much to do with jumps in the position n max . On the other hand, the approach presented 
there seems to work not only for homogeneous systems so we hope to successfully apply it to our case. It 
is therefore convenient to consider the quantity T mn - the average time it takes to fall from m to n balls 
at the condensed site, or more precisely, the first passage time from the state with to balls to the state 
with n balls at that site. T mn can be easily estimated from computer simulations - one starts to count 
the time when m\ passes through to, and stops when it passes through n for the first time. Repeating 
this many times one gets the average time. 

This quantity can be combined with a typical life-time T using the following picture: in the condensed 
phase, the node with maximal degree takes an extensive number of balls A while for the remaining nodes 
TOi's fluctuate around the average number p c <C A. We suppose these fluctuations to be much faster 
than the life-time of the condensate. The condensate disappears when toi s» p c . Thus T w T mn where 
to ps A, n ss p c . The averaging should actually be done over all possible value of to, n with the appropriate 
weight. How to choose this weight and how to finally calculate T will be explained later. Now we would 
like to focus on the time T mn for given to and n. 

To calculate T mn we adopt the method presented in Ref. |100| . We assume that the condensate can be 
considered as slowly changing in comparison to fast fluctuations in the bulk. Suppose that at particular 
time, the condensate has to balls. After one time step, the condensate can loose one ball, gain one ball 
or remain intact. Let us denote by A m the probability that to — > to + 1 and by p m that to — > m — 1. 
Assume additionally that po = 0, Am = and fx m >x, \ m <M are greater than zero. One can see that T mn 
has to fulfill the following equation: 

Tmn — 1 ~l~ ^jnTm+l ,n ~t~ pm^m— X,n ~t~ (1 /^m)?mn, (3.107) 

with T nn = 0. Defining d m = T mn — T m _i „ we rewrite that equation in the form: 

dm.P'm — ^mdm+l = 1. (3.108) 

With zero on the right-hand side it would have a solution d m — Ilfe^i 1 Mfe/Afc- In general, the solution 
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Figure 3.12: Average periods of time between jumps in n max versus the density p, for homogeneous 
graphs. The simulation has been performed on various graphs with N — 20 nodes and the jumping rate 
u(m) = 1 + 3/m, the critical density p c = 1. Circles: 4-regular graph, squares: complete graph, diamonds: 
one-dimensional closed chain. Solid lines are power laws aoM ai fitted to data; for various graphs ai = 
3.6 . . .4.2. The dashed curve giving the asymptotic formula (|3. 105[i with arbitrary proportionality factor 
is far away from numerics. 



has the form d m — c rn Y\™ = i p-k/^k with c m > 0. The equation (|3.108p gives the recursion for c m : 



^ m — 1 

i = — Y\. ^fc/^fe- 



(3.109) 



k=l 



The maximal number of balls at one node is M, so it must be cm+i = 0. With this boundary condition, 
Eq. (|3.109[) has the solution: 

M l-l , 

^ - e v n v- ( 3 - 110 ) 



l—m k—1 



This leads to the following expression for T mn : 



rn m /p-l \ ( M 1 l-l , \ 

p= „+i p=„+i \ fc= i Afe / y i=p w ,=i ^ y 



(3.111) 



In our case p m = u(m) since it gives the probability that m — > m — 1 at ith node. To find A m , let us 
consider a Master equation for the distribution of balls 7Tj(m) at site i: 



d t iTi(m) = 7Tj(m + l)/x m +i + 7Tj(m - l)A m _ a + 7Tj(m)(l - fi m - A m ). 
In the stationary state the derivative vanishes and hence 

7r,(m + l)/i m +i - Ki{m)\ m = Hi{m)[i m - 7Ti(m - l)A m _i = const. 



(3.112) 



(3.113) 



One sees that expressions on both sides of the last equation cannot depend on m. Inserting m — 1 we 
see the constant is equal to zero. From Eq. I|3.113p we obtain A m : 



Am 



Tii(m + 1) 



(3.114) 



Inserting p m and A m to Eq. (|3. 1 1 1ft . after some manipulations we obtain 



M 



p=n+l KF > tyr > l=p 



(3.115) 



We now apply the Eq. (|3. 115ft to the case of a single-inhomogeneity graph assuming as before u{p) = 1. 
Let us start with the star graph as a special degenerate case and calculate the average transition times 
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Tmn for the central node on which the condensate spends almost all time. Using the formula <|3.95|) with 
a = l/(N — 1) for the star graph we have: 



lmn J^.^ ' (M + N-p-2)\(M-iy/ (6 - nb) 

The terms vanish for p — > m if to ^> 1. The sum over I decreases slowly with p because it is a cumulative 
distribution for tti(1) (see Eq. (|3.115|) ) . Thus for large to the transition time is almost independent of 
to. This means that the condensate fluctuates very quickly around some value 1 <C (toi) < M. We 
know from our previous considerations that (mi) M and fluctuations are very small, so it is enough 
to concentrate on Tmu- Changing variables we get 



M-n-l 



(3.117) 



T Mn = (N-2y. y - — - — ^(n- iy Y(n - iy k 2 + fe , )! . 

V ' ^ (N-2 + r)l K ' ^ V ; kUN -2)1 

r=0 y 1 k=0 y ' 

In the last sum we can set the upper limit to infinity. We have: 

TMn « (— 2 ) (N - 2)1 g w -^ Tl (N - iy. (3.118) 
The sum over r can be done approximately by changing the variable r — > M — n — 1 — r: 

T Mn M (^1) N (N - 2)!(iV - 1)--^ M-n-l (M-n-l)l 

\N-2J y ,y ' M -n-2(M + N -n-3)! V ' 

We see that, because of the factor (N — 1)~", the time Tmu is very sensitive to n. In figure I3TT31 it is 
compared to computer simulations. This complicated formula has a simple behavior in the limit of large 
systems and n = 0. For M — > oo and N being fixed we get an exponential growth: 

T M o~ (N-l) M , (3.120) 

while for fixed density p = M/N and N — > oo it increases faster than exponentially: 

T M0 ~ e P NlnN . (3.121) 

The approximated expressions p,120p and (|3.12ip can be obtained very easily using a kind of Arrhenius 



law jlOOL I104j . which tells that the average life-time is inversely proportional to the minimal value of the 
balls distribution: 

T M0 - l/< lin , (3.122) 

if one thinks about the condensate's melting as of tunneling through a potential barrier. In our case the 
barrier 1/7Ti(to) grows monotonically with to — > so we observe that the condensate bounces from the 
wall at to = rather than tunnels through it. We have 7r™ ln ~ (N — 1)~ M for fixed N and large M 
and thus we get Eq. (|3. 120[) , while for fixed density p the distribution 7r™ m falls over-exponentially which 
results in Eq. (|3. 121)1 , 

Before we comment on the exponential behavior of times T mn , let us calculate analogous quantities 
for the general single-inhomogeneity graph. In the condensed state the occupation mi fluctuates quickly 
around the mean condensate size (mi) = A, with the variance estimated by Eq. l|3.103j) as ~ N. Although 
A is smaller than M we can assume that T mn fa Turn for to > A. This is so because the transition time 
between the states with high number of balls to the state with to w A must be small since the potential 
1/7Ti(to) decreases with to — » A. Therefore, we can concentrate again on Tmu which is easier to compute. 
From Eqs. lpT95|) and (|3.115[1 we have: 

M M 



Changing variables we get: 



p— n+l l—p 



™ (M-p)l y (M + JV-p-g-2)! 

= j£ x (M + N- P -2y. g a (M- P - q y. ■ ( 3124 ) 
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Figure 3.13: The average life-time Tmq for a star graph with N = 10 calculated from Eq. (|3. 119ft (left 
panel) and for a single-inhomogeneity graph with N = 20, k = 4, ki = 16 from Eq. (|3. 127(1 (right panel), 
compared to computer simulations (circles). 



The sum over q can be approximated in the condensed state by an integral and evaluated by the saddle- 
point method. The saddle point qo = a(N — 2)/(l — a) is equal to to* from Eq. (|3.92|) and therefore all 
calculations are almost identical. In this way we get: 



,^2_ M ((AT-2)/(l-a))! 2na(N - 2) 
(a(N-2)/(l-a))\V (1 - af ' 



(3.125) 



The only dependence on p is through the factor a p . To calculate Tmu it is therefore sufficient to evaluate 
the sum: 



M 

£ ■ 

>=n+l 



(M-p)\ 



(M + iV-p-2)!' 



(3.126) 



Because every term is proportional to 1/Vt (p) from Eq. Q3.95P , in the condensed state the function under 
the sum has a minimum at the saddle point po w to* S (1, M). This means that the effective contribution 
to the sum can be split into two terms: one for small p and one for p « M. The "small-p" part can be 
evaluated like for static distributions in the previous section. To calculate the "large-p" part, it is sufficient 
to take the last two terms, namely for p — M and p = M — 1, since the large p terms decrease quickly 
with p. The complete formula for Tmu is given by: 



Tmu 



. a ^l-M ((N - 2)/(l - a))! 27ra(iV - 2) 



a 



Ml 



(a(J\T-2)/(l-a))!l 
M + N- 2 X '" 1 1 



(M + AT - 2)! 



M 



(l-a) a 

M + N - 2 



1 - a- 



M 



a M - 1 (q(JV-l) + l) 
(ATTI)! 



(3.127) 



In Fig. l3.13l we compare the theoretical expression for Tmo with Monte Carlo simulations. The agreement 
is the better, the larger M is. In the limit of large number of balls, M — > oo, while keeping and a 
fixed, the life-time grows exponentially: 



T, 



MO 



M 



In the limit of fixed density p = M/N > p c and for M, N — > oo: 

T M0 - exp [A 7 (- ln(l - a) +p\n{p/a) - (1 + p) ln(l + p))] , 



(3.128) 



(3.129) 



Let us now comment on the exponential times observed for inhomogeneous graphs. Unlike in homogeneous 
systems, where the life time grows like a power of M, in the presence of inhomogeneity it changes to 
the exponential behavior. This is typical for systems possessing a characteristic scale. Here it is given 
by the ratio k\/k. This situation is to some extent similar to the relation between massless and massive 
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interactions in particle physics. A two-point function for a massless field has a power-law decay and thus 
an infinite range, while for a massive field it falls off exponentially with the distance. 

Tmo calculated above gives us some insight into the dynamics of the condensate. Now we try to find 
a formula for T that is for the average time between consecutive jumps of the position n max of the node 
with maximal number of balls. We do not expect that the behavior of T will be asymptotically different 
from T mn but we would like to check if we understand well the process of melting and rebuilding the 
condensate. 

The main contribution to the average life-time of the condensate comes from situations when it 
occupies the node with the highest degree. The probability that at a particular time step the condensate 
has m balls is given by 7r 1 (m). The condensate ends its life at a certain m 1 = n number of balls which is 
no bigger than at the remaining nodes, which means that there is at least one node i ^ 1 with the same 
or higher occupation: m, > mi. We call this an "event A" and denote the probability of its occurrence 
by a(n). In order to calculate the average time T we have to sum over all possible m,n, weighted by 
appropriate probabilities: 



T 



1 



M M 



2 , / „ 7ri(m)r mra q(n), 



(3.130) 



where the first factor gives an appropriate normalization of o(n). We have already calculated 7Ti(m) and 
T m n for the single-inhomogeneity graph. What remains is to calculate a(n). We have to consider the 
subset of all configurations {mi, . . . , tun} which favor the event A. One step before A happens, we have 
n+ 1 balls at the first node and no more than n balls at other nodes. The number of balls is M, therefore 
at nodes 2, . . . , TV there is M — n — 1 balls in total. New configurations which lead to A are the following: 

1. n balls at the 1st node, n + 1 balls at one node among N — 1 remaining nodes, and rrii < n balls 
on each of N — 2 remaining nodes, that is M — 2n — 1 in total on N — 2 nodes, 

2. n balls at the 1st node, n balls at one node among N — 1 nodes and rrii < n at each of remaining 
nodes, with the total number of balls M — 2n on those nodes. 



The probability of the event A is proportional to the sum of all configurations described above: 

1 



a(n) = 



Z(N,M) 



(N-l) 



m,3— mjv— 



+m 3 H Vm N 



M-r. 



(3.131) 



First two factors give the normalization which corresponds to the sum over all configurations and over 
N — 1 possibilities of choosing the node having exactly n + 1 or n balls. In the square bracket we have 
numbers of configurations with fixed amount of balls on nodes i = 3, . . . , N. The last two terms arise from 
degrees of nodes: the weight fc" for the first and k M ~ n for the rest of nodes. Skipping the multiplicative 
factor and denoting the quantity in square brackets by b(n) we have 



a(n) oc b(n)a 

The coefficient b(n) can be expressed through the following sum: 



(3.132) 



N-2 



6(n) = 53(-l) 



N - 2 
r 



N + M-4-2n- r(n + 1) 
M - 1 - 2n-r(n + 1) 



N + M—3 — 2n — r(n + 1) 
M — 2n— r{n + 1) 



(3.133) 



This formula is obtained by using the integral representation of discrete deltas in Eq. (|3.13ip and by 
calculating each sum over rrii separately. 

We could now write in principle the formula for T. It would be quite complicated so we decided not to 
present it here, but it can be evaluated numerically using the theoretical formulas given above. In figure 
13.141 we present the comparison between average life-times computed from simulations and calculated 
analytically. One clearly sees that while M increases, the points approach the theoretical curve but are 
systematically slightly above it. This means that f is a bit larger that predicted by Eq. (|3. 130[) . 



60 



le+02 - 



le+06 



le+04 




le+OO 



20 



30 



40 



50 



60 



70 



M 



Figure 3.14: Average life-time of the condensate on a small- inhomogeneity graph with N = 20, k = 4 
and ki = 8 (circles), ki = 12 (squares). Empty symbols: numerical estimations, solid lines: analytical 
formula (|3.130p . Error bars are of symbols size. 

3.2.4 Power-law distribution in the ZRP on inhomogeneous graphs 

In previous sections we have noticed that there are many differences between zero-range processes on 
homogeneous and inhomogeneous networks. For a homogeneous system, one can have a scale-free dis- 
tribution of balls 7r(m) if one tunes the hop rate u(m) appropriately, while for an inhomogeneous one, 
distributions are in general exponential. In inhomogeneous systems the condensate, if exists, resides at 
the node with the largest degree for almost all time, and even if it melts, it rebuilds very fast on this node, 
while in homogeneous systems the condensate moves from node to node. For inhomogeneous systems 
the typical time-scale for melting the condensate grows exponentially with N while in homogeneous ones 
only as a power of N. 

The power-law distributions at criticality are very interesting since they are typical for systems without 
a characteristic scale. Inhomogeneous systems have usually a typical scale introduced by the fact, that 
the flows of balls are different between different nodes. We want to address the question whether it 
is possible to obtain a power-law distribution of balls occupation numbers at the critical point also for 
inhomogeneous networks. As we shall see, the answer to this question is in the affirmative, but it requires 
a fine-tuning of the node-degree distribution of the underlying network. In this section we shall show 
how to do this. We shall also discuss some well-known examples of graphs including Erdos-Renyi graphs, 
for which the averaging over the ensemble of graphs leads to the particle distribution which resembles a 
power-law, although it is only a finite-size effect. 

Unlike in previous sections, where we were interested in properties of the ZRP on a given, fixed 
network, we consider now an ensemble of random graphs from Chapter 2. The graphs are defined by 
specifying a desired degree distribution II(fe) in the thermodynamical limit. They can be generated by 
the Monte Carlo algorithm described in Sec. 2.1.5. In this case, the probability P(ki, . . . , /cat) = P(k) of 
having a network with a sequence of degrees k\, . . . , fcjv factorizes for N — > oo: 



This assumption means that we consider only uncorrelated networks. It is approximately fulfilled for 
finite-size graphs if II(fe) falls quickly with k, as it results from the equivalence between canonical and 
grand-canonical partition function for networks [18] discussed previously, since in the grand-canonical 
ensemble there is no constraint on the sum of degrees. The factorization breaks down for scale-free 
networks. Particularly strong deviations from the factorization are observed for H(k) ~ fc~ 7 with 2 < 
7 < 3, where finite-size effects are especially strong [65]. Below we shall discuss networks which are free 
of these problems and for which the factorization works fine. 

Let us recall the partition function for the ZRP on a given graph: 



We are now interested in the behavior of the ZRP on a "typical" network, taken from the ensemble of 
graphs with distribution of degrees given by Eq. (|3.134p . We want to average over all possible degree 



p(k) = n(k 1 )---u(k N ). 



(3.134) 




(3.135) 
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sequences in the given ensemble. We thus define a canonical partition function: 

Z(N,L,M)= P(k)Z(N,M,k), (3.136) 

where L is the total number of edges in the graph which, as in the canonical partition function for 
networks, is assumed to be fixed. The dependence on L we pull into the probability P(k). 

To simplify calculations, we set u{m) — 1 as before, since we expect the effect of network inhomogeneity 
to be stronger than the effect coming from the dependence of u(m) on to. The canonical partition function 
(|3. 136)1 assumes now the form: 

N 

°^2 i m.i,M IImK), (3.137) 

fa i—1 

where fi(m) is the mth moment of the degree distribution H(k): 

oo 

H{m) = ^n(fc)fc m (3.138) 
fc=i 

This partition function has exactly the form of the partition function (|3-56|) for homogeneous ZRPs, 
which we have discussed before. This shows that averaging over uncorrelated networks is equivalent 
to considering another ZRP, for a homogeneous system. The averaging smears the inhomogeneity and 
restores the symmetry with respect to the permutation of occupation numbers. Instead of fi(m), distinct 
for different nodes, we have only one /(m) = fi(m), identical for all nodes. Moreover, we will see that if 
all the moments fi{m) exist, the system is self-averaging in the sense that for large N a single network 
taken from the given ensemble reproduces the limiting distribution of balls. 

After these preliminaries we are now ready to attack the main question, namely how to choose n(fc) 
in order to obtain a scale-free distribution of balls occupation numbers: 7r(m) ~ m~ b , at the critical 
point. From section 3.2.2 and Eq. (|3.137p we see that fi(m) should behave as mT h for large m. Thus we 
are looking for the degree distribution U.(k) which gives the following moments ([3.138P : 

, . Tim + 1) , 
MM = r(n U 1+ V m - (3-139) 

This particular form of /i(m) is well suited for analytical calculations, but of course we expect a similar 
behavior for any other /i(m) having the asymptotic behavior ~ m~ b . The exponential term cj) m does not 
change the 7r(m) at the critical point p c , since for conserved number of balls it gives only an overall factor 
4> M in Z(N, L, M). We shall use the freedom of choice of 4> to adjust the mean value of the distribution 
II(fc) to the average degree k — 2L/N which is fixed in the ensemble with given N,L. Introducing a 
generating function 

i z m _ ^ (zq 

... - r(m + 1 + b) ' 

-0 m— v ' 

we can recover II(Ar) for k > as a Fourier coefficient by means of the inverse transform: 



M(z) = £ n{m) Z — = £ - - (3.140) 

^— ' to! ' 1 TO + 1 + 0) 



Xl{k)=N— ( dze lzk M(-iz), (3.141) 

27r J-7T 

where J\f gives appropriate normalization, since M(—iz) is by definition equal to ^ fc n(/c) exp(— iz/c). 
The integral in Eq. (|3.14ip is in general hard to calculate and express through special functions like sine 
and cosine integral. However, the function M{—iz) falls to zero with z — * ±00 sufficiently fast and thus 
for <f) ^> 1 we can extend the limits of integration to ±00. Then equation (|3.14ip becomes a Fourier 
Transform of the function M{—iz) from Eq. (|3.20|) . introduced in section 3.1.2. Hence we know that it 
can be written as an infinite series expansion (|3.22|) (see also [62]). This complicated expression simplifies 
very much in the present case. Changing variables k — > x<j> we have 

n(^) = — / dze tzx Y / } - . (3.142) 
1 V> 27r0i_ oo ^ Q r(m+l + 6) V ' 

and we can now apply the results of section 3.1.2. According to Eq. (|3.22|) . the last integral gives 

6 ^ 

T fc=0 



k\T{b- k)' 



(3.143) 
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Figure 3.15: n(fc) calculated from the exact formula (|3.14ip (points) and approximated one (|3. 144[) (thick 
line), for J\f — 2ir and 6 = 3. Squares: tf> = 6, circles: 4> — 30. The approximate solution diverges for 
x > 1 and has to be cut. For < x < 1 the approximation is the better, the larger is 4>. 
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Figure 3.16: Desired (dotted line) versus real //(m) for finite networks (solid lines) obtained from 
Eqs. dgmH and ([3. 138ft . for 6 = 3. Lines from left to right: = 5,10,20,40,80, which corresponds 
to k — 1.67, 2.89, 5.4, 10.4, 20.4 from Eq. (|3. 146[) . These plots approximate also ir(m) at the critical point, 
for infinite networks. The parameter <j> grows almost linearly with k. 



and hence 

In figure 13.151 we show II(fc) calculated from the exact equation 1|3.141[1 and from the approximated 
one (|3. 144[) . Because the probabilistic interpretation of H(k) requires that it must be non-negative, the 
solution (|3. 144|) is physical only for k < <p. So we have to set II(fc) = above (p. Thus [(f>\ can be 
interpreted as the maximal degree existing in the network. One must be, however, aware that even the 
true II(fc) calculated directly from Eq. I|3.141j) can be negative above 4> and that cutting the negative part 
leads to some discrepancy between the desired /i(m) from Eq. 1|3.139|) and that obtained when (|3. 144[) is 
inserted into Eq. (|3. 138(1 . Hence we must make sure if we really have 7r(m) ~ m~ b at the critical point. 
The answer is that for any finite network we can always choose 4> so that the discrepancy between the 
power-law behavior and the real n(m) becomes negligible. In figure [HUB] we plot the desired function /i(m) 
and we compare it to that calculated for various <j> from Eqs. (|3.138p and (|3. 144[) . As 4> increases, the plots 
tend to the power law. We see that in order to get the correct ball distribution in the thermodynamic 
limit we have to scale k to have limAr^oo k = oo. Such networks with II(fc) given by Eq. (|3. 144)1 are 
neither sparse nor very dense since k scales with a power of N less than one. 

The parameter (f> is related to the average degree by the formula: k — Ylt=i H(fc)fc. The normalization 
Af must be chosen so that J2t=i = 1. The sum goes from one because there can be no isolated 
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nodes (k = 0) on the graph. When b = 2,3,4, one is able to find closed formulas for the normalized 
degree distribution II(fc). For instance, for b — 3 we have 

U{k) = -IV (3145) 

for < k < 4> and zero for k = and k > 0, with <f> given by the following expression: 



-l + Ak+y/l- 16k + 16k 2 ) /2. (3.146) 
In general, for large <f>, the relation between <f> and k is almost linear: 

- k Sti(fc - ^-'k „ J> - fc)fc"- 1 dfc _ 

Eti^-^)"- 1 tfkO-'dk b+V 

Because <f> grows with k, one should take graphs large enough to minimize finite-size corrections. In other 
words, the ratio <j>/N should be much smaller that 1. 

We performed Monte Carlo simulations of the ZRP on random networks with the degree distribution 
(|3. 144)1 to check whether one indeed obtains a power law in the distribution of balls 7r(m). The simulations 
were made as follows. First we generated a connected graph from the ensemble of random graphs with 
the degree distribution from Eq. (|3. 144[) . using the Monte Carlo algorithm described in previous chapter. 
The graph had no degree-degree correlations, except of those introduced by finite-size effects. On that 
graph we simulated the zero-range process starting from a uniform distribution of balls. We collected 
a histogram of 7r(m) and repeated the simulation for other networks from the ensemble. In total, we 
generated over 50 networks for the given set of parameters N, M, L. 

The crucial point is to ensure that those graphs are connected. The Monte Carlo algorithm presented 
before generates in principle graphs which may have disconnected parts. But we know (see e.g. [I]) 
that for random graphs there exists a critical average degree k c (a percolation threshold), above which a 
single component is always formed in the limit of N — > oo. In our simulations we always checked that 
we were above k c and that the graph we used was connected. We also simulated tree graphs which are 
by definition connected. For trees, however, finite-size effects are stronger than for graphs. 

In figure 13.171 we compare a theoretical distribution at the critical point p c = 1 for 6 = 3, with 
experimental ones obtained by numerical simulations. The agreement is not perfect. Finite-size effects 
are strong. But we see an apparent power law in the distribution of balls. The curves shown in figure 
13.161 would suggest that for k = 8, the power law should continue up to m of order 100. In figure l3~T7l 
we see that it deviates already before, probably because it is not exactly at the critical point. Indeed, we 
observe a reminiscence of the condensation suggesting that the system is already off the transition point. 
These deviations are induced by the fact that for any finite graph there is a condensation on the most 
inhomogeneous node [99]. This gives a peak in n(m). Because the condensate takes an extensive number 
of balls, it effectively increases the critical density for the rest of the system, so we are slightly below p c . 

In figure 13.181 we present results for large networks, but without averaging over the ensemble. We 
again get a power law which indicates that a self-averaging takes place. As before, the experimental line 
does not agree ideally with the theoretical one, 7r(m), but the power-law behavior is clear. 

The argumentation presented above suggests that one has to fine-tune the degree distribution in 
order to obtain the scale-free distribution of the number of balls. What happens if one takes different 
distributions? We have performed the ZRP also on some other networks and surprisingly found that 
7r(m) seems to be also heavy-tailed. In figure [3491 we show results of numerical experiments for random 
trees [37] and ER random graphs. Random trees are equilibrated trees with weights p(k) = 1 and have 
been already mentioned in Sec. 2.1.4. They can be generated using the Monte Carlo algorithm given in 
Sec. 2.1.5. The degree distribution for random trees reads 

U{k) = (fc~-T)T' ( 3 - 148 ) 



for k > and 11(0) = 0. For ER graphs, II(fc) is approximately Poissonian as we know from Eq. (|2.5p . 
In figure 13.191 we see the results of measuring n(m) on networks of size of order few hundreds. The 
distribution 7r(m) seems to follow a power law, in a certain range. In order to check whether this range 
increases in the large N limit, one would have to perform a systematic analysis for networks of growing 
sizes. 

However, here we prefer to present some theoretical discussion of whether it can be a power-law 
or rather some other distribution. Let us calculate the theoretical distribution n(m) = /i(m) at the 
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Figure 3.17: The distribution of balls in the ensemble of graphs with degree distribution H(k) <~ {<f> — fc) 2 . 
Solid line: theoretical ir(m) <~ to~ 3 at the critical point p c = 1 and for infinite system. Circles: for trees 
with N — M — 400, averaged over 50 graphs, <j> « 6. Squares: for simple graphs with N = M = 400, 
fc = 8, w 30, diamonds: as before but M = 300, triangles: N = 800, M = 600. 




Figure 3.18: Demonstration of self-averaging: 7r(m) for a single network with degree distribution n(fc) <~ 
{(j) - fc) 2 , for two different sizes N ~ M ~ 5000 (thin solid line) and 10000 (dashed line) and fc = 16. 
For each case four networks were generated to estimate error bars. Only a few experimental points are 
shown for brevity. The thick solid line gives the asymptotic distribution. 




Figure 3.19: The experimental distribution of balls averaged over 200 graphs taken from the ensemble 
of random trees and ER graphs. Circles: ER graphs N = 400, M = 300, fc = 8, squares: as before but 
N = 800, M = 450, diamonds: ER with N = M = 400, fc = 16, triangles: trees with N = 800, M = 300. 
Thick lines show power laws ir(m) <~ mT h with b = 4.87 (upper line) and 5.52 (lower line). 
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Figure 3.20: The distribution n(m) obtained from Eq. l|3.138j) with cut Poissonian degree distribution. 
The almost straight line on the log-log plot explains partially the quasi-power-law observed in numerical 
experiments in the figure [37191 

critical point, in the thermodynamical limit. For random trees, the generating function M(z) defined in 
Eq. I|3.140p has a closed form: M(z) = exp(z + e z ), as follows from Eq. (|3. 148(1 . The function fx[rn) is 
given by the inverse Laplace transform: 

fn ' f 

H(rn) = — d> M(z)z- m - 1 dz, (3.149) 
2m J 

which can be evaluated by the saddle-point integration around z « ln(m/lnm): 

In /i(m) = m(lnm — lnlnm) + 0(m), (3.150) 

and hence n(m) grows over-exponentially for large to. The hop rate u{m) for an equivalent homogeneous 
ZRP decays fast with to. This means that the condensation happens for any density of balls, and 
the bulk distribution falls faster than any power-law. Similarly, one can estimate that for random ER 
graphs M(z) oc exp(ke z ) — 1 and hence the leading term in ln/x(m) is also ralnm, so we again have the 
condensation. 

It is clear from the above arguments that in the limit M — * oo one cannot obtain power-law distribution 
of balls for maximally random graphs like ER graphs or random trees. If it is not a power law, how the 
behavior in Fig. 13.191 can be explained? For finite systems, we observe that the power law goes only over 
one-two decades, so it can be easily confused with another function. Such a quasi-power-law behavior is 
presented in Fig. 13.20} where we have calculated 7r(m) for finite-size ER graph by means of Eq. <|3. 138(1 
multiplied by a factor exp(— to/3). In order to mimic such finite-size effects we have assumed that the 
degree distribution II(fc) had a cutoff at some fcmax calculated from the condition that inn = 200 samples 
of graphs of size N — 400 it should be around one node with degree fc max : nNTL(k max ) 1, and hence 
kmax ~ 22. The factor has been chosen to get an almost straight line on the plot. Normally this is done 
by the factor Z(N — 1, M — m)/Z(N, M) in the formula for distribution of balls, and in real simulations 
by taking an appropriate value of M, which brings the system to the critical density p c . 

To summarize the discussion of this subsection, we have shown that tuning the node-degree distribu- 
tion n(fc) on a random network, on which the zero-range process is defined, one can obtain the power 
law in the balls occupation distribution 7t(to). This makes the system scale-free at the critical point, in 
contrast to the previously discussed single-inhomogeneity graphs. The key point is that although degrees 
of nodes differ, their distribution is so tuned that averaging over distributions of balls for nodes with 
different degrees gives exactly the power law. On every single node i, however, the distribution of balls is 
not a power law, but it falls exponentially as (ki/k max ) m . The only exception is the node with maximal 
degree fc max , where the condensation takes place. 

This is not the case for maximal random graphs like ER graphs, where the degree distribution is 
concentrated around k and the ZRP behaves like for a homogeneous system with constant hop rates, 
having an over-exponential decay in 7r(m) for small to and a condensation peak at large m. The situation 
is similar to that for S-F networks, where the node-degree fluctuations are strong enough to produce a 
node of degree much larger than other degrees. This node attracts the condensate [97] . 
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Chapter 4 

Conclusions and outlook 



Complex networks have been widely studied in recent years. Being a discipline on the interface of physics, 
chemistry, biology, social and computer sciences, and others, it applies a variety of methods. Most people 
try to understand observed properties of networks by introducing simplified models and then by making 
computer simulations in order to compare results to real-world data. Some of them use a multitude of 
so-called mean field approaches, when the quantity of interest is assumed to evolve in an averaged field 
of all interactions. This, however, can rarely allow one to examine such effects like phase transitions or 
condensation, and the results can be only qualitative. Moreover, some problems may be ill-posed when 
one does not specify what the word "averaged" means. In this thesis we have tried to present a consistent 
theory of statistical mechanics of complex networks, where all problems can be formulated in terms of 
some averaged quantities over a well-defined statistical ensemble. The starting point of the formulation 
is the ensemble of Erdos-Renyi graphs, where all graphs have the same statistical weight. But we have 
seen that ER graphs weakly reproduce features observed in real-world networks. Therefore, we have 
assigned different statistical weights to graphs from the same set, enhancing the probability of occurrence 
of graphs of a certain type. For instance, by assigning to each node a functional weight p(k) depending on 
its degree k, we have been able to obtain any desired degree distribution, either for simple or degenerated 
graphs, for trees or graphs with cycles, and for causal as well as for equilibrated networks. In particular, 
we can reproduce for equilibrated networks the scale-free degree distribution, one of the most important 
properties of real networks. We have shown also how the approach via statistical ensemble can be used to 
calculate degree-degree correlations or the assortativity coefficient. We have pointed out that the same 
method can be used for growing networks. We have discussed how to reformulate models of preferential 
attachment in the language of network's ensembles and how to relate them to the rate-equation approach, 
which is a very powerful analytical method. 

At the end of discussion devoted to statistical ensembles of graphs we have presented a comparison 
between growing and equilibrated networks. We have shown how to choose functional weights in both 
ensembles, in order to obtain the same power-law degree distributions and not to introduce node-node 
correlations. Then we have focused on some global properties like the assortativity or the diameter. We 
have found that both types of networks are disassortative but that the degree-degree correlation function 
e(k, q) exhibits different behavior for these graphs. We have observed a similar difference for the diameter, 
which scales like In N for growing unweighted networks, also for growing trees, thus indicating the small- 
world behavior, while for equilibrated unweighted trees it grows like ~ N 1 / 2 . In other words, we have 
explicitly shown that graphs in the two ensembles, despite having identical degree distributions, may have 
completely different geometrical properties. In this particular case, the origin of the differences, shortly 
speaking, comes from the fact that the set of causal graphs forms only a small subset of all possible 
graphs in the statistical ensemble, and the properties of that subset are quite different to those observed 
as "typical" for the whole set. 

Further differences between various graphs with the same degree distribution have been discussed in 
Chapter 3. Using analytical and numerical methods we have tested theoretical predictions for the position 
of the cutoff known from the literature on the subject. We have encountered an unexpected difference 
between the values of the exponent a, describing the scaling of the cutoff k max with the network size, for 
causal and equilibrated trees. We have pointed out that the two estimates of k ma x for simple equilibrated 
graphs found in the literature seem to be inconsistent in light of results presented in this thesis. As a 
by-product of this analysis we have developed a method of calculating the cutoff function w(x), which 
allows one to treat many models of growing networks in a unified, standard fashion. 

In the second part of Chapter 3 we have discussed dynamics on networks. We have studied the 
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zero-range process, being just the balls-in-boxes model with a certain type of local dynamics. We have 
shown that for inhomogeneous systems, that is when node degrees differ much from each other, static 
and dynamical properties of the system are different than those for homogeneous systems studied in the 
past. For instance, when the inhomogeneity is strong enough, it triggers the condensation on the most 
inhomogeneous node. The critical density of balls, above which the condensation takes place, depends 
on ki/k, where fci is the largest degree and k is the typical degree in the network. In particular, on 
S-F networks the ZRP always develops the condensation. Another interesting effect of inhomogeneity 
is a qualitative change in the behavior of a typical life-time of condensate, which grows exponentially 
or faster, in contrast to homogeneous systems where it grows only as a power of the system size. We 
have seen also that the effect of inhomogeneity can be weakened for some node-degree distributions. In 
particular, we have found a special form of the distribution II(fc) ~ (0 — fc) fc_1 , for which the system of 
balls behaves very much like on a homogeneous networks at the critical point, where the distribution of 
balls occupation number is scale-free: n(m) ~ vnT h . 

At the end, let us say a few words about possible directions of further studies, and concepts which 
can be an interesting continuation of ideas discussed in this thesis. Among many interesting ideas, it is of 
the primary interest of the author to study dynamical processes taking place on dynamically rearranged 
networks. Suppose that the ZRP can interact with the topology of the underlying network and change 
it while the occupation of nodes is changing. A slow change in the network's structure should be well 
approximated by averaging of the ZRP over an ensemble of static networks as it was done in Sec. 3.2.4. 
If the network evolves quickly in comparison to the characteristic time of the ZRP, its evolution can be 
viewed as a sequence of rewirings as those described in section devoted to Monte Carlo simulations, but 
with additional weights for nodes, depending on the state of the ZRP. It is very interesting to study what 
happens in between, that is when the two characteristic time scales are comparable. 

Another class of problems where this kind of the two-fold evolution becomes important is related to 
neural networks. If one couples the evolution of neuron's states to the evolution of connections between 
them, one observes a self-organized criticality that produces a S-F network and a small- world [105]. The 
question is whether one can mimic this behavior using a simpler model, or to predict it analytically in 
the framework described here. 

There are also many other questions, for instance if one can use the approach via moments of the 
distribution n(fc) to estimate the cutoff in some other models of growing networks, especially with degree- 
degree correlations, or how the properties of causal networks change when, after a certain time, we allow 
for some rewirings that homogenize the network. As the example of Watts and Strogatz's small-world 
model shows, an interesting behavior is possible. We hope to address these and other problems in the 
future investigations. 
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